Abstract

This paper presents a comprehensive approach for model-based diagnosis which in- 
cludes proposals for characterizing and computing preferred diagnoses, assuming that the 
system description is augmented with a system structure (a directed graph explicating the 
interconnections between system components). Specifically, we first introduce the notion of
a consequence, which is a syntactically unconstrained propositional sentence that character- 
izes all consistency-based diagnoses and show that standard characterizations of diagnoses, 
such as minimal conflicts, correspond to syntactic variations on a consequence. Second, 
we propose a new syntactic variation on the consequence known as negation normal form 
(NNF) and discuss its merits compared to standard variations. Third, we introduce a basic 
algorithm for computing consequences in NNF given a structured system description. We
show that if the system structure does not contain cycles, then there is always a linear-size 
consequence in NNF which can be computed in linear time. For arbitrary system structures,
we show a precise connection between the complexity of computing consequences
and the topology of the underlying system structure. Finally, we present an algorithm
that enumerates the preferred diagnoses characterized by a consequence. The algorithm is 
shown to take linear time in the size of the consequence if the preference criterion satisfies
some general conditions. 

1. Introduction 

This paper presents a comprehensive approach for characterizing and computing preferred 
diagnoses when the system description is augmented with a system structure (Darwiche, 
1995). A system structure is a directed acyclic graph explicating the interconnections 
between system components. Adding a system structure to a classical system description 
(de Kleer, Mackworth, & Reiter, 1992) leads to what we call a structured system description,
examples of which are shown in Figures 1 and 2. 

The most common approach for characterizing (and computing) diagnoses has been the 
use of conflicts and their derivatives such as kernel diagnoses (de Kleer & Williams, 1987;
Reiter, 1987; de Kleer et al., 1992). Moreover, the most common method for computing 
these characterizations has been the use of Assumption-Based Truth Maintenance Systems 
(ATMSs) (de Kleer, 1986; Reiter & de Kleer, 1987; Forbus & de Kleer, 1993). We will 
first explain the difficulties with such an approach and then describe the elements of our 
approach that address these difficulties. 
The major problem with standard characterizations of diagnoses is that they tend to
be exponential in size, which is largely due to their syntactic nature. Specifically, stan- 
dard characterizations of diagnoses correspond to the prime implicants/implicates of some 
propositional sentence, and the number of such implicants/implicates tends to be exponential
even for expressions that correspond to simple diagnosis problems. Computationally,
this problem manifests as an ATMS label that has an exponential number of environments 
(Forbus & de Kleer, 1993). 

This difficulty has led to a body of research on "focusing" the ATMS, which attempts 
to control the size of ATMS labels (Provan, 1996; de Kleer, 1992; Forbus & de Kleer, 1988; 
Dressler & Farquar, 1990; Collins & DeCoste, 1991). Focusing is based on the following
intuition. A label characterizes all diagnoses of a given problem, but one is rarely interested 
in all diagnoses; therefore, one rarely needs a complete label. Most often, one is inter- 
ested in diagnoses that satisfy some preference criterion, such as most-probable diagnoses. 
Therefore, one can use such a criterion to compute "focused" labels that are of reasonable 
size, yet are good enough to characterize the diagnoses of interest. However, although a 
standard framework exists for computing ATMS labels (Forbus & de Kleer, 1993), no such 
framework seems to exist for focusing. 

Another issue with standard frameworks for computing diagnoses (based on minimiz- 
ing propositional sentences) is that their computational complexity is not formally tied to 
properties of system descriptions that are easily accessible to engineers who would be con- 
structing these descriptions. An example of such a property is the topology of a system 
structure (component interconnectivity). Providing computational complexity guarantees
in terms of such properties can be extremely useful in practice, as our experience has shown. 
In real-world applications, one may have a choice of what failures to include in the scope of 
a diagnostic system, and therefore a choice of what aspects of a complex system to include 
in a system description. In such situations, it is very important to be able to assess the 
effectiveness of diagnosis algorithms by an intuitive examination of the resulting system 
description, such as examining the topology of a structured system description. The frame- 
work we shall develop in this paper addresses this particular point and has proven very 
useful in helping us engineer system descriptions on which our diagnostic algorithms are 
guaranteed to be effective. 

The approach we present in this paper is based on three main ideas, which address the 
problems mentioned above: 

1. Characterizing diagnoses using negation normal forms: We propose the no- 
tion of a consequence which is a syntactically unconstrained propositional sentence 
that characterizes all consistency-based diagnoses (Darwiche, 1995, 1997). We show 
that standard characterizations of diagnoses correspond to syntactic restrictions on a 
consequence. Specifically, minimal conflicts correspond to the prime implicates of a 
consequence and kernel diagnoses correspond to its prime implicants. We adopt a less 
restrictive syntax of consequences known as negation normal form (NNF) of which 
prime implicants/implicates are a special case (Barwise, 1977). Although we do not 
guarantee that our NNF representation of consequences is the most compact, we do 
offer some guarantees on this representation that cannot be offered with respect to 
standard representations. 
2. Utilizing system structure in computing consequences: We introduce a basic
algorithm for computing consequences in NNF, the complexity of which is determined 
by the topology of the system structure. We show that for tree system structures 
(those containing no undirected cycles), there is always a consequence in NNF that 
is linear in the number of nodes and arcs in the structure.^1 Moreover, a standard
characterization of diagnoses using minimal conflicts can be exponential in size for 
some of these system structures. For arbitrary system structures, we provide a precise 
relationship between the system structure, the size of a consequence, the time to 
compute it, and, hence, the difficulty of a diagnosis problem. 

3. A mechanism for computing minimal diagnoses:^2 We show that if a consequence
is in decomposable negation normal form, then one can extract the minimal
diagnoses it characterizes in time linear in its size, as long as the minimality cri- 
terion satisfies some general conditions. The algorithm we propose for computing 
consequences is guaranteed to generate consequences in decomposable negation nor- 
mal form. Moreover, the conditions on a minimality criterion do admit the common 
criterion of minimum cardinality. 

Therefore, we are providing a paradigm for diagnostic reasoning with system structures, 
consequences, and minimality criteria as the key components. By using this paradigm, 
one is guaranteed some complexity results that are determined by the topology of the 
system structure. As we shall see, this approach is similar to the network-paradigm in the 
probabilistic and constraint-satisfaction literature where system structure is the key aspect 
that decides the difficulty of a reasoning problem. 

The literature contains a number of other proposals for importing this structure-based 
theme into model-based diagnosis (Dechter & Dechter, 1996; Geffner & Pearl, 1987).
Although these approaches appeal to similar underlying principles and lead to similar
complexity results, some key differences exist between our approach and the previous ones. First,
our formulation is based on symbolic logic, which is the tradition in model-based diagnosis, 
while the previous proposals have been based on constraints among multivalued variables. 
Second, the complexity of our algorithm depends not only on the system structure but also 
on the system observation. The stronger the system observation is, the better the com- 
plexity of our algorithm, leading to linear complexity in the extreme case (independently of 
the system structure). Finally, we separate the computation of minimal diagnoses into two 
phases: the characterization of all diagnoses using a consequence and then the extraction 
of minimal diagnoses from the consequence. This separation has a number of implications 
which are discussed in detail later in the paper. 

This paper is structured as follows. We introduce the notion of a consequence in Sec- 
tion 2, proving that it characterizes all consistency-based diagnoses, and showing its relation 
to some standard notions in the literature on model-based diagnosis. We then introduce 
three key theorems for constructing consequences in Section 3 and formalize the role of 
system structure in determining the complexity of computing consequences. In Section 4, 

1. Note, however, that the size of a consequence and the time to compute it are exponential in the size
of families (each node and its parents in the system structure represent a family). In structure-based
reasoning, it is typically assumed that the size of a family is small enough to be treated like a constant.

2. In this paper, we use "minimal" and "preferred" interchangeably when referring to diagnoses. 
Figure 1: A structured system description (SSD) of a digital circuit. An SSD has two parts:
a directed acyclic graph and a set of component descriptions, each associated with 
a node in the graph. The formal definition of an SSD is given in Section 3.3. 



we turn to an algorithm for computing consequences when the system description is aug- 
mented with a system structure. The computational complexity of the presented algorithm 
is discussed at length in Section 5. We then provide an algorithm in Section 6 for extracting 
minimal diagnoses from a consequence. We finally close in Section 7 with some concluding 
remarks. Proofs of all theorems and lemmas are delegated to Appendix F. 

2. Characterizing Diagnoses 

We start in this section with a review of model-based diagnosis and then lead into the 
notion of a consequence for characterizing diagnoses. We show how standard characteri- 
zations of diagnoses, such as minimal conflicts, can be viewed as syntactic variations on 
the consequence and then introduce a new syntactic variation known as negation normal 
form. We also discuss our reasons for adopting this non-standard form for characterizing 
diagnoses. 

2.1 Model-Based Diagnosis 

In model-based diagnosis, we use the term system description to denote a system model 
(de Kleer et al., 1992; Reiter, 1987). Traditionally, a system description consists of a set of 
logical sentences $\Delta$ called a database and a set of distinguished symbols $A = \{okX, okY, \ldots\}$
called assumables. Assumables represent the health of components and are initially assumed
Figure 2: A structured system description of a digital circuit.



to be true. For example, in Figure 2, the assumables are $okX$ and $okY$, and the database $\Delta$
contains the four propositional sentences shown in the figure.

A diagnosis problem emerges when assumables can no longer be justified. Specifically,
given some sentence $\phi$ that represents an observed system behavior, the system is considered
faulty if $\phi$ is inconsistent with $\Delta \cup A$. In this case, one needs to relax some of the assumables
(that is, replace instances of $ok\cdot$ with instances of $\neg ok\cdot$) in order to restore consistency. A
particular relaxation of these assumables is called a diagnosis as long as it is consistent with
the system description and observation. The cardinality of a diagnosis is the number of faults
contained in the diagnosis. In Figure 2, a system observation $C \wedge D$ would indicate a failure.
Moreover, there are three diagnoses in this case: $okX \wedge \neg okY$, $\neg okX \wedge okY$, and $\neg okX \wedge \neg okY$,
with cardinalities 1, 1 and 2, respectively.
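
The definition of a diagnosis as a consistent relaxation of the assumables can be checked by brute force on a circuit this small. The following Python sketch is only an illustration: the encoding of the Figure 2 database as four Python predicates (an inverter X with input A and output C, an AND gate Y with inputs A, B and output D) is an assumption made here, and the names `DATABASE`, `consistent`, `diagnoses` and `cardinality` are hypothetical.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

# Assumed encoding of the four sentences of Figure 2. Each sentence is a
# predicate over a truth assignment `w` (a dict from atom names to booleans).
DATABASE = [
    lambda w: implies(w["A"] and w["okX"], not w["C"]),
    lambda w: implies((not w["A"]) and w["okX"], w["C"]),
    lambda w: implies(w["A"] and w["B"] and w["okY"], w["D"]),
    lambda w: implies((not (w["A"] and w["B"])) and w["okY"], not w["D"]),
]
NON_ASSUMABLES = ["A", "B", "C", "D"]
ASSUMABLES = ["okX", "okY"]

def consistent(partial):
    """True if some assignment extending `partial` satisfies every sentence."""
    free = [a for a in NON_ASSUMABLES + ASSUMABLES if a not in partial]
    for values in product([True, False], repeat=len(free)):
        w = dict(partial, **dict(zip(free, values)))
        if all(sentence(w) for sentence in DATABASE):
            return True
    return False

def diagnoses(observation):
    """All instantiations of the assumables consistent with database and observation."""
    result = []
    for values in product([True, False], repeat=len(ASSUMABLES)):
        instantiation = dict(zip(ASSUMABLES, values))
        if consistent({**observation, **instantiation}):
            result.append(instantiation)
    return result

def cardinality(diagnosis):
    """Number of assumables set to false, i.e. the number of faults."""
    return sum(1 for healthy in diagnosis.values() if not healthy)

# Observation C and D, as in the example above.
found = diagnoses({"C": True, "D": True})
```

Under this encoding, `found` holds the three diagnoses listed above, and the all-healthy instantiation is excluded. The enumeration is exponential in the number of atoms and is meant only to make the definition concrete.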

We have the following formal definition of a system description, which we adopt in the 
rest of this paper. The definition is a variation on the standard one provided in (Reiter, 
1987) and is preceded by some preliminary definitions. 

Definition 1 Let S be a set of atomic propositions (atoms). An S-literal is a literal whose
atom is in S. An S-sentence is a propositional sentence in which each literal is an S-literal.
An S-instantiation is a conjunction of S-literals, one literal for each atom in S.

For example, if $S = \{A, B, C\}$, then there are eight S-instantiations: $A \wedge B \wedge C$, $A \wedge B \wedge \neg C$,
$\ldots$, $\neg A \wedge \neg B \wedge \neg C$. Moreover, $A \wedge \neg B \supset C$ and $A \supset C$ are S-sentences, but $A \supset D$ is not.

Definition 2 (System Description) A system description is a triple $(P, A, \Delta)$, where $P$
and $A$ are sets of atomic propositions such that $P \cap A = \emptyset$, and $\Delta$ is a set of propositional
sentences constructed from atoms in $P$ and $A$. Here, $P$ is called the set of non-assumables;
$A$ is called the set of assumables; $\Delta$ is called a database. It is required that $\Delta$ be consistent
with every A-instantiation.

Given the notion of a system description, we can define the other two key terms in 
model-based diagnosis (de Kleer et al., 1992): 






Darwiche 



Definition 3 (Observation) Given a system description $(P, A, \Delta)$, a system observation
is a consistent conjunction of P-literals.

Definition 4 (Diagnosis) Given a system description $(P, A, \Delta)$ and a system observation
$\phi$, a diagnosis is an A-instantiation that is consistent with $\Delta \cup \{\phi\}$.

It is worth mentioning here that the condition we imposed on database $\Delta$ in Definition 2
is equivalent to saying that every A-instantiation is a diagnosis for the system observation true. That is,
if we have no observation about the system, then we cannot conclude anything about the
health of its components.^3

2.2 The Consequence 

An ultimate goal of diagnostic reasoning is to compute the minimal diagnoses (according 
to some criterion) for a given system description $(P, A, \Delta)$ and observation $\phi$. Standard
approaches to model-based diagnoses do this in conceptually two steps. First, they char- 
acterize the set of diagnoses using conflicts and then extract minimal diagnoses from this 
characterization. 

We will follow the same approach except that we will not use conflicts to characterize 
the set of diagnoses. Instead, we will adopt an equivalent but syntactically different
characterization of diagnoses known as negation normal form (Barwise, 1977). Before we discuss
this alternate characterization of diagnoses, however, we need to introduce the notion of a 
consequence which is very useful in putting the different characterizations of diagnoses in 
perspective. 

The consequence of an observation is defined formally below (Darwiche, 1995, 1997): 

Definition 5 (Consequence) Given a system description $(P, A, \Delta)$, the consequence of
system observation $\phi$, written $Cons_A^{\Delta}(\phi)$, is a sentence satisfying the following properties:

1. $Cons_A^{\Delta}(\phi)$ is an A-sentence;

2. $\Delta \cup \{\phi\} \models Cons_A^{\Delta}(\phi)$;

3. For any A-sentence $\beta$, $\Delta \cup \{\phi\} \models \beta$ only if $Cons_A^{\Delta}(\phi) \models \beta$.

That is, the consequence of observation $\phi$ is the logically strongest A-sentence which is
entailed by the system description $\Delta$ and observation $\phi$.

When clear from the context, we drop the superscript $\Delta$, the subscript $A$, or both, from
the notation $Cons_A^{\Delta}$.

3. This condition is intuitive if we are not representing fault modes and are restricting ourselves to using
the $ok\cdot$ assumables. However, if we use fault modes, such as stuck-at-0 and stuck-at-1, then the condition
is not as intuitive and may even appear restrictive. Specifically, we may want to constrain these two
assumables such that $\neg$stuck-at-0 $\vee$ $\neg$stuck-at-1 is true. But introducing such a constraint would violate
the condition we are imposing on $\Delta$ in Definition 2. We will make two points regarding this issue. First,
one can represent constraints among assumables without violating the above condition; the details of this
are explained in Appendix D. Second, a better solution to this problem is to use multivalued variables
instead of atomic propositions, which leads to a more convenient, but less standard framework, which is
also discussed in Appendix D.

4. The consequence of an observation is unique up to logical equivalence. 






Model-Based Diagnosis using Structured-System Descriptions 



For example, if we observe that both C and D are true in the circuit of Figure 2, we
conclude that one of the gates must be malfunctioning: under normal conditions, C being
true implies that A is false, which further implies that D is false. Therefore, a conclusion one
can make about assumables, having observed $C \wedge D$, is that $\neg okX \vee \neg okY$. Moreover, this
is the strongest conclusion that can be made given the system description and observation
at hand. Formally, this means that $\neg okX \vee \neg okY$ is the consequence of observation $C \wedge D$,
$Cons_A^{\Delta}(C \wedge D)$.

That a consequence characterizes all consistency-based diagnoses is shown formally be- 
low: 

Theorem 1 (Characterization) Given a system description $(P, A, \Delta)$, an A-instantiation
$\alpha$ is a diagnosis for system observation $\phi$ according to Definition 4 iff $\alpha \models Cons_A^{\Delta}(\phi)$.

For example, the consequence $\neg okX \vee \neg okY$ characterizes three diagnoses: $\neg okX \wedge \neg okY$,
$okX \wedge \neg okY$ and $\neg okX \wedge okY$.

Standard characterizations of diagnoses can be viewed as syntactic restrictions on a con- 
sequence. Consider the following theorem first, which is a corollary of the results reported 
in (de Kleer et al., 1992): 

Theorem 2 Given a system description $(P, A, \Delta)$ and observation $\phi$, we have the following:

- A partial diagnosis is any A-sentence which is an implicant of $Cons_A^{\Delta}(\phi)$.

- A kernel diagnosis is any A-sentence which is a prime implicant of $Cons_A^{\Delta}(\phi)$.

- A conflict is any A-sentence which is an implicate of $Cons_A^{\Delta}(\phi)$.

- A minimal conflict is any A-sentence which is a prime implicate of $Cons_A^{\Delta}(\phi)$.

Prime implicates and implicants are standard notions but we include their definitions here 
for completeness: 

Definition 6 An implicant $\beta$ of a sentence $\alpha$ is a satisfiable conjunction of literals which
entails $\alpha$. We say that $\beta$ is a prime implicant of $\alpha$ if no proper subset of its literals satisfies this
condition. An implicate $\beta$ of a sentence $\alpha$ is a non-valid disjunction of literals which is
entailed by $\alpha$. We say that $\beta$ is a prime implicate of $\alpha$ if no proper subset of its literals satisfies
this condition.

Note that each sentence is equivalent to the conjunction of its prime implicates. Therefore, 
the conjunction of minimal conflicts is nothing but a syntactic form of the consequence. 
Similarly, each sentence is equivalent to the disjunction of its prime implicants. Therefore, 
the disjunction of kernel diagnoses is also a syntactic form of the consequence.^5
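
To make the correspondence in Theorem 2 concrete, the following Python sketch enumerates the prime implicants and prime implicates of the running consequence (the negation of okX or the negation of okY) by brute force. It is a minimal illustration and not an algorithm proposed in this paper; the representation of a term or clause as a dict from atom to literal sign, and the names `terms`, `is_implicant`, `is_implicate` and `prime`, are assumptions.

```python
from itertools import combinations, product

ATOMS = ["okX", "okY"]

def consequence(w):
    # Consequence of observation "C and D" in the running example.
    return (not w["okX"]) or (not w["okY"])

def worlds():
    for values in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))

def terms(size):
    """All sets of `size` literals over distinct atoms, as dicts atom -> sign."""
    for atoms in combinations(ATOMS, size):
        for signs in product([True, False], repeat=size):
            yield dict(zip(atoms, signs))

def is_implicant(term):
    # The conjunction `term` entails the sentence.
    return all(consequence(w) for w in worlds()
               if all(w[a] == s for a, s in term.items()))

def is_implicate(clause):
    # The disjunction `clause` is entailed by the sentence.
    return all(any(w[a] == s for a, s in clause.items())
               for w in worlds() if consequence(w))

def prime(accepts):
    """Smallest-first search; keep a candidate only if no accepted subset exists."""
    found = []
    for size in range(1, len(ATOMS) + 1):
        for candidate in terms(size):
            subsumed = any(p.items() <= candidate.items() for p in found)
            if accepts(candidate) and not subsumed:
                found.append(candidate)
    return found

kernel_diagnoses = prime(is_implicant)    # prime implicants
minimal_conflicts = prime(is_implicate)   # prime implicates
```

For this consequence the search yields two kernel diagnoses (the single literals for a faulty X and a faulty Y) and one minimal conflict (the clause over both assumables). The enumeration is exponential in the number of assumables, which is the size problem that motivates the less restrictive syntax adopted next.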

5. The notion of a consequence as we define it in this paper seems to correspond to notions that have 
appeared previously in the diagnostic literature, but these notions did not prove to be computationally 
influential. For example, Saraswat et al. define in (Saraswat, de Kleer, & Raiman, 1990) a maximally 
abstract diagnosis, which corresponds to our notion of a consequence. They also prove that a maximally 
abstract diagnosis characterizes the set of consistency-based diagnoses in the sense of Theorem 1 above. 
Similarly, Ayeb et al. define in (el Ayeb, Marquis, & Rusinowitch, 1993) a deductive diagnosis, which is
closely related to a consequence — the logically strongest deductive diagnosis is the consequence. In both 
cases, however, the proposed notions are not utilized computationally as we shall utilize consequences in 
the rest of this paper. 
2.3 Negation Normal Form

The minimal-conflict representation of a consequence is a minimized conjunctive normal
form (CNF) and the kernel-diagnoses representation is a minimized disjunctive normal
form (DNF). Both of these standard forms, however, are special cases of negation normal
form (NNF) (Barwise, 1977), which we adopt in this paper:

Definition 7 (Negation Normal Form) A sentence $\alpha$ is in negation normal form (NNF)
if and only if $\alpha$ is either a literal; a disjunction $\bigvee_i \alpha_i$; or a conjunction $\bigwedge_i \alpha_i$, where each $\alpha_i$
is in negation normal form.

That is, in an NNF, the negation operator can only be applied to atoms — it cannot be 
applied to compound sentences. For example, $\neg(A \wedge B)$ is not in NNF, but $\neg A \vee \neg B$ is.
An NNF will typically have nested conjunctions of disjunctions and nested disjunctions of 
conjunctions. 

The algorithm we present later will generate consequences in negation normal form. The 
generated consequences are not guaranteed to be the most compact, but we shall offer some 
guarantees on their sizes that do not hold with respect to consequences in standard forms. 

The consequences we shall generate are not only in negation normal form but are also 
decomposable: 

Definition 8 (Decomposable) A sentence $\alpha$ in NNF is decomposable if and only if no
atoms are shared by any conjuncts in $\alpha$.^6

The sentence $((\neg A \vee \neg okX) \wedge (\neg C \vee \neg okY)) \vee ((\neg okX \vee A) \wedge (\neg okY \vee C))$ is in decomposable
negation normal form: the negation operator appears only next to atoms in the sentence;
there are no common atoms between the conjuncts $\neg A \vee \neg okX$ and $\neg C \vee \neg okY$; nor
are there any common atoms between the conjuncts $\neg okX \vee A$ and $\neg okY \vee C$.

The decomposability property is quite strong because it allows one to decompose cer- 
tain computations with respect to an NNF into smaller computations with respect to its 
subsentences. In Section 6, we shall see how this decomposability property will allow us 
to extract the minimal diagnoses characterized by a consequence by simply combining the 
minimal diagnoses characterized by its subsentences.^7

Throughout the paper, we will be representing negation normal forms using directed 
acyclic graphs. This representation is detailed in Appendix A which also provides a number 
of operations for manipulating this graphical representation of negation normal forms. The 
operations defined in this appendix will be used in the pseudocode that we present later for 
generating consequences. 
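
As a small illustration of decomposability and of the satisfiability test described in footnote 7, the following Python sketch represents an NNF as nested tuples. This tuple encoding is an assumption made for the example and is not the graphical representation of Appendix A; the names `lit`, `atoms`, `is_decomposable` and `is_satisfiable` are likewise hypothetical.

```python
# NNF nodes: ("lit", atom, sign), ("and", [children]), ("or", [children]).
def lit(atom, sign=True):
    return ("lit", atom, sign)

def atoms(node):
    """Set of atoms mentioned anywhere below `node`."""
    if node[0] == "lit":
        return {node[1]}
    return set().union(*(atoms(child) for child in node[1]))

def is_decomposable(node):
    """True if, at every conjunction, no two conjuncts share an atom.
    Not optimized: atom sets are recomputed at every level."""
    if node[0] == "lit":
        return True
    if node[0] == "and":
        seen = set()
        for child in node[1]:
            child_atoms = atoms(child)
            if seen & child_atoms:
                return False
            seen |= child_atoms
    return all(is_decomposable(child) for child in node[1])

def is_satisfiable(node):
    """One pass over the sentence; correct only if the NNF is decomposable."""
    if node[0] == "lit":
        return True
    results = [is_satisfiable(child) for child in node[1]]
    return all(results) if node[0] == "and" else any(results)

# The decomposable sentence used as an example above.
example = ("or", [
    ("and", [("or", [lit("A", False), lit("okX", False)]),
             ("or", [lit("C", False), lit("okY", False)])]),
    ("and", [("or", [lit("okX", False), lit("A")]),
             ("or", [lit("okY", False), lit("C")])]),
])

# A conjunction whose conjuncts share atom A: in NNF, but not decomposable.
clash = ("and", [lit("A"), lit("A", False)])
```

On `clash`, `is_satisfiable` returns True even though the sentence is unsatisfiable, which shows why the conjunction case of the test needs decomposability.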

6. Every DNF is also a decomposable NNF. But this is not necessarily true for CNFs. 

7. To appreciate this decomposability property, consider testing satisfiability as an example. It is easy to 
verify that such a test can be performed in time which is linear in the size of an NNF if the NNF is 
decomposable. In particular: 

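1. if $\alpha$ is a literal, then $\alpha$ is satisfiable;

2. if $\alpha = \alpha_1 \vee \ldots \vee \alpha_n$, then $\alpha$ is satisfiable iff some $\alpha_i$ is satisfiable;

3. if $\alpha = \alpha_1 \wedge \ldots \wedge \alpha_n$, then $\alpha$ is satisfiable iff every $\alpha_i$ is satisfiable.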

Case 3 above does not hold in general since each of two sentences may be satisfiable but their conjunction 
may not. However, if the two sentences do not share any atoms, then their satisfiability is enough to 
guarantee the satisfiability of their conjunction. 
3. Computing Consequences: Some Fundamental Theorems

Given some system observation $\phi$ and some minimality criterion, our goal is to compute
all minimal diagnoses of $\phi$ according to this criterion. We will do this in two steps. First,
we will compute the consequence of $\phi$ in NNF. Second, we will extract from the computed
consequence all the minimal diagnoses it characterizes. The second step will be addressed 
in Section 6. In this and the following section, we focus on the first step. 

Our strategy for computing consequences is to construct them from component conse- 
quences. Intuitively, a component consequence is the strongest conclusion one can draw 
about the health of a component given a particular state of its ports (inputs and outputs). 
We shall present two theorems, Decomposition and Case-Analysis, which are sufficient to
construct any consequence by logically combining component consequences. We start by 
discussing component consequences first and then present the mentioned theorems. 

3.1 Component Consequences 

We assume that the sentences in database $\Delta$, of a system description $(P, A, \Delta)$, are grouped
into sets, each representing the description of some component in the system. Consider the
system in Figure 2 for example, which has two components whose outputs are denoted
by C and D. The database $\Delta$ for this system is viewed as the union of two component
descriptions $\Delta_C$ and $\Delta_D$ where

$$\Delta_C = \{\, A \wedge okX \supset \neg C,\;\; \neg A \wedge okX \supset C \,\}$$

and

$$\Delta_D = \{\, A \wedge B \wedge okY \supset D,\;\; \neg(A \wedge B) \wedge okY \supset \neg D \,\}.$$

Without loss of generality, we assume that a system component has one output and we use
this output to identify the component.^8

If $\Delta_O$ is a component description, then $Cons_A^{\Delta_O}(\phi)$ is called a component consequence
whenever $\phi$ is an instantiation of the ports of component O. Following are some component
consequences with respect to Figure 2:

$$Cons_A^{\Delta_C}(A \wedge C) = \neg okX,$$

$$Cons_A^{\Delta_C}(A \wedge \neg C) = true,$$

$$Cons_A^{\Delta_D}(A \wedge B \wedge D) = true,$$

$$Cons_A^{\Delta_D}(A \wedge \neg B \wedge D) = \neg okY.$$

Intuitively, a component consequence is the strongest conclusion that can be made about the
health of a component given an observation about that component's ports.

We will now provide a theorem that will be the basis for computing a component con- 
sequence in time linear in the number of clauses in the component description. But first, 
the following definition: 



8. If we want to model a component with n outputs, we model it as a set of n components, each with one 
output. 
Definition 9 (Projection) The projection of an instantiation (clause) $\alpha$ on a set of atoms
S, written $\alpha_S$, is the conjunction (disjunction) of all S-literals in $\alpha$. If $\Delta$ is a database,
then $\alpha_\Delta$ is the projection of $\alpha$ on the atoms appearing in $\Delta$.

For example, the projection of $\alpha = A \vee \neg B \vee C$ on atoms $\{A, B\}$ is $A \vee \neg B$. Moreover, the
projection of $\alpha$ on database $\Delta_C$ above is $A \vee C$.

Theorem 3 (Component-Consequence) Let O be a component with inputs I and
description $\Delta_O$ (in clausal form). If $\phi$ is an instantiation of atoms $I \cup \{O\}$, then

$$Cons_A^{\Delta_O}(\phi) = \bigwedge_{\alpha \in \Delta_O,\; \phi \models \neg\alpha_P} \alpha_A.$$

Consider the component description $\Delta_D$ in Figure 2, which is shown in clausal form below:

$$\Delta_D = \{\, \neg A \vee \neg B \vee \neg okY \vee D,\;\; A \vee \neg okY \vee \neg D,\;\; B \vee \neg okY \vee \neg D \,\}$$

Consider also the observation $\phi = A \wedge \neg B \wedge D$ which represents a particular state of the
component ports. For the first and second clauses $\alpha$ in $\Delta_D$, we have $\phi \models \alpha_P$. For the third
clause $\alpha$, we have $\phi \models \neg\alpha_P$. Therefore, the component consequence in this case is $\neg okY$,
which is the projection of the third clause in $\Delta_D$ on the assumables.
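
The projection operation and the computation just described can be sketched in a few lines of Python. This is an illustration under stated assumptions: a clause is encoded as a dict from atom to literal sign (True for a positive literal), and the names `project`, `falsifies` and `component_consequence` are hypothetical. The result is returned as a CNF, that is, a list of clauses over the assumables, with the empty list standing for true.

```python
ASSUMABLES = {"okY"}

# Assumed dict encoding of the clausal form of the AND-gate description.
DELTA_D = [
    {"A": False, "B": False, "okY": False, "D": True},  # ~A | ~B | ~okY | D
    {"A": True, "okY": False, "D": False},              # A | ~okY | ~D
    {"B": True, "okY": False, "D": False},              # B | ~okY | ~D
]

def project(clause, atoms):
    """Projection (Definition 9): keep only literals whose atom is in `atoms`."""
    return {a: s for a, s in clause.items() if a in atoms}

def falsifies(instantiation, clause):
    """True if the instantiation makes every literal of the clause false."""
    return all(instantiation[a] != s for a, s in clause.items())

def component_consequence(clauses, phi):
    """Conjoin the assumable part of each clause whose port part is falsified
    by the port instantiation `phi`."""
    ports = set(phi)
    return [project(c, ASSUMABLES) for c in clauses
            if falsifies(phi, project(c, ports))]

# Observation A and not-B and D from the example above.
faulty = component_consequence(DELTA_D, {"A": True, "B": False, "D": True})
# Observation A and B and D, which is consistent with a healthy gate.
healthy = component_consequence(DELTA_D, {"A": True, "B": True, "D": True})
```

Each clause is visited once and each of its literals is inspected a constant number of times, in line with the linear-time claim made for the theorem.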

Any consequence computed by Theorem 3 is guaranteed to be in CNF. By simply 
converting it to DNF, we would also be converting it to decomposable NNF since each 
DNF is a decomposable NNF. Note that this conversion is exponential in the number of 
assumables appearing in the consequence. However, the number of such assumables is 
typically small enough to justify viewing this conversion as taking constant time. 

3.2 Decomposition and Case-Analysis

We now provide two theorems which are sufficient for constructing any system consequence 
from component consequences. 

Theorem 4 (Decomposition) Let $(P, A, \Theta \cup \Gamma)$ be a system description and let $\phi$ be a
system observation. If the atoms shared by $\Theta$ and $\Gamma$ all appear in $\phi$, then

$$Cons_A^{\Theta \cup \Gamma}(\phi) = Cons_A^{\Theta}(\phi_\Theta) \wedge Cons_A^{\Gamma}(\phi_\Gamma).$$

The theorem is intuitively saying that we can decompose a consequence with respect to a
database $\Theta \cup \Gamma$ into two simpler consequences, one with respect to database $\Theta$ and the other
with respect to database $\Gamma$, as long as the atoms shared by $\Theta$ and $\Gamma$ appear in the observation
$\phi$.

Now, what if the observation does not contain all atoms that are shared between $\Theta$
and $\Gamma$? We can still decompose the computation of a consequence $Cons_A^{\Theta \cup \Gamma}(\phi)$ in such a
case, but at the expense of performing a case analysis on the shared atoms between $\Theta$ and
$\Gamma$. For this we need the following key theorem:
Theorem 5 (Case-Analysis) Let $(P, A, \Delta)$ be a system description, $\phi$ be a system
observation, and let S be a subset of P. Then

$$Cons_A^{\Delta}(\phi) = \bigvee_{\alpha} Cons_A^{\Delta}(\phi \wedge \alpha),$$

where $\alpha$ ranges over all instantiations of atoms S that are consistent with $\phi$.

Case-Analysis is typically used to set the stage for Decomposition. That is, if the observation
$\phi$ does not contain all atoms that are shared between $\Theta$ and $\Gamma$, we would perform a case
analysis on shared atoms not appearing in $\phi$ and then apply Decomposition to each of the
resulting cases. The application of Case-Analysis followed by Decomposition is summarized
below:

Corollary 1 (Intersection) Let $(P, A, \Theta \cup \Gamma)$ be a system description, $\phi$ be a system
observation, and let S contain all atoms shared by $\Theta$ and $\Gamma$. If S contains no assumables,
then

$$Cons_A^{\Theta \cup \Gamma}(\phi) = \bigvee_{\alpha} Cons_A^{\Theta}(\alpha \wedge \phi_\Theta) \wedge Cons_A^{\Gamma}(\alpha \wedge \phi_\Gamma),$$

where $\alpha$ ranges over all instantiations of S that are consistent with $\phi$.

Using Decomposition and Case-Analysis one can always construct a system consequence
from component consequences as follows. We partition the component descriptions in $\Delta$
into two subsets, $\Theta$ and $\Gamma$. We identify the common atoms between $\Theta$ and $\Gamma$ and then
apply the Intersection Corollary. This allows us to decompose a consequence $Cons^{\Delta}(.)$ into
a number of smaller consequences of the form $Cons^{\Theta}(.)$ and $Cons^{\Gamma}(.)$, where each of $\Theta$ and
$\Gamma$ is smaller than $\Delta$. We then apply the same procedure, recursively, on each of $Cons^{\Theta}(.)$
and $Cons^{\Gamma}(.)$ until we reach component consequences which can be computed using the
Component-Consequence Theorem.

This procedure will always work. However, it does not guarantee the size of the resulting
consequence. Specifically, the procedure does not tell us how to partition a database $\Delta$ into
two databases $\Theta$ and $\Gamma$. Depending on this choice, the number of common atoms between
the partitioned databases will vary, leading to a better or worse consequence size.^9

As we shall see next, a system structure can play a key role in making this partitioning 
choice. A system structure is a directed acyclic graph that explicates the interconnections 
between system components. When a system description is augmented with a system 
structure, we refer to the result as a structured system description. Structured system 
descriptions are defined formally in the following section, but first the following example on 
applying the Intersection Corollary. 

Consider Figure 2 where $\Delta = \Delta_C \cup \Delta_D$, and let us compute the consequence of the
system observation $\phi = C \wedge D$, that is, $Cons^{\Delta}(\phi)$. We need to construct this consequence
from component consequences of the form $Cons^{\Delta_C}(.)$ and $Cons^{\Delta_D}(.)$. We cannot do this

9. Realize that the expansion suggested by Case-Analysis is exponential in the number of atoms on which
we do the case analysis. Therefore, it is important to choose partitions that will minimize the common 
atoms between the partitioned databases. 
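immediately, however, since the observation $\phi$ does not mention atom A which is shared by
$\Delta_C$ and $\Delta_D$. Therefore, we can use the Intersection Corollary with $S = \{A\}$:

$$Cons^{\Delta}(\phi) = [Cons^{\Delta_C}(A \wedge \phi_{\Delta_C}) \wedge Cons^{\Delta_D}(A \wedge \phi_{\Delta_D})] \vee [Cons^{\Delta_C}(\neg A \wedge \phi_{\Delta_C}) \wedge Cons^{\Delta_D}(\neg A \wedge \phi_{\Delta_D})].$$

Substituting for $\phi_{\Delta_C} = C$ and $\phi_{\Delta_D} = D$, we get:

$$Cons^{\Delta}(C \wedge D) = [Cons^{\Delta_C}(A \wedge C) \wedge Cons^{\Delta_D}(A \wedge D)] \vee [Cons^{\Delta_C}(\neg A \wedge C) \wedge Cons^{\Delta_D}(\neg A \wedge D)].$$

The resulting expression contains only component consequences, which we assume are
precomputed. Substituting for the values of these consequences we get:

$$Cons^{\Delta}(C \wedge D) = [\neg okX \wedge true] \vee [true \wedge \neg okY] = \neg okX \vee \neg okY.$$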

Note that the Intersection Corollary constructs only conjunctions and disjunctions (no 
negations). Therefore, if component consequences are in NNF, then any system consequence 
constructed using the corollary must also be in NNF. Moreover, if we ensure that component 
descriptions do not share assumables, then we are guaranteed that the resulting NNF is
also decomposable.^10 This follows because the Intersection Corollary will apply the conjoin
operator only to consequences which correspond to disjoint subsystems, that is, collections 
of disjoint component descriptions. 
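The worked example above can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: the component models (`delta_c` as an inverter X from A to C, `delta_d` as a buffer Y from A to D) are inferred from the component consequences quoted in the example and are not given in this section, and all function names are hypothetical. A consequence is represented here simply as the set of assumable instantiations it admits.

```python
from itertools import product

# Assumed component models (inferred from the example, not stated in the text):
# X is an inverter from A to C, Y is a buffer from A to D.
def delta_c(a, c, ok_x):
    return (not ok_x) or (c == (not a))

def delta_d(a, d, ok_y):
    return (not ok_y) or (d == a)

def component_cons(component, a, out_value):
    """Values of the component's assumable that are consistent with its ports."""
    return {ok for ok in (False, True) if component(a, out_value, ok)}

def system_cons(c_obs, d_obs):
    """Case analysis on the shared atom A, i.e. S = {A}."""
    models = set()
    for a in (False, True):                      # one disjunct per instantiation of S
        xs = component_cons(delta_c, a, c_obs)
        ys = component_cons(delta_d, a, d_obs)
        models |= set(product(xs, ys))           # conjoin two disjoint subsystems
    return models

# Observation C and D: the admitted (okX, okY) pairs.
result = system_cons(True, True)
# Models of (not okX) or (not okY), for comparison.
expected = {(x, y) for x, y in product((False, True), repeat=2) if not x or not y}
```

Under these assumptions `result` and `expected` coincide, which mirrors the derivation of $\neg okX \vee \neg okY$ above.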

3.3 Structured System Descriptions 

We now turn to the formal definition of a structured system description. We start with 
the definition of a component description (with respect to some assumables $\mathbf{A}$ and non-assumables $\mathbf{P}$).

Definition 10 (Component Description) A component description is a triple $(\mathbf{I}, O, \Delta_O)$
where $\mathbf{I}$ is a set of non-assumables; $O$ is a non-assumable such that $O \notin \mathbf{I}$; and $\Delta_O$ is a
set of propositional sentences satisfying the following conditions:

1. $\Delta_O$ can only mention assumables in $\mathbf{A}$ or non-assumables in $\mathbf{I} \cup \{O\}$.

2. Every instantiation of $\mathbf{I} \cup \mathbf{A}$ is consistent with $\Delta_O$.

The second condition above prohibits a component description from specifying a direct
relationship between the inputs of a component and is typically self-imposed.

We are now ready to provide the formal definition of a structured system description.
We use $\mathcal{G}_P$ to denote the parents of node $P$ in a directed acyclic graph $\mathcal{G}$.

Definition 11 (Structured System Description) A structured system description (SSD)
is a tuple $(\mathbf{P}, \mathbf{A}, \mathcal{G}, \Delta)$, where $\mathbf{P}$ and $\mathbf{A}$ are sets of atomic propositions such that $\mathbf{P} \cap \mathbf{A} = \emptyset$;
$\mathcal{G}$ is a directed acyclic graph over nodes $\mathbf{P}$; and $\Delta$ is a function that maps each node $P$ in
$\mathbf{P}$ into a set of propositional sentences $\Delta_P$ such that $(\mathcal{G}_P, P, \Delta_P)$ is a component description. Here, $\mathbf{P}$ is called the set of non-assumables, $\mathbf{A}$ is called the set of assumables, and $\mathcal{G}$ is
called the system structure. It is required that no assumables be shared between component
descriptions.

10. Assuming that component descriptions are themselves decomposable. 
We shall overload the meaning of $\Delta$ and use it to denote $\bigcup_{P \in \mathbf{P}} \Delta_P$ (the union of all component descriptions).

The last requirement in Definition 11 ensures the decomposability of generated conse- 
quences and does not limit the expressive power of structured system descriptions. Specifi- 
cally, one can always ensure that component descriptions do not share assumables, but at 
the expense of adding auxiliary nodes to the system structure. See Appendix C for details. 

Each structured system description $(\mathbf{P}, \mathbf{A}, \mathcal{G}, \Delta)$ induces a system description $(\mathbf{P}, \mathbf{A}, \Delta)$
according to Definition 2:

Theorem 6 If $(\mathbf{P}, \mathbf{A}, \mathcal{G}, \Delta)$ is a structured system description, then $(\mathbf{P}, \mathbf{A}, \Delta)$ is a system
description.

That is, if $(\mathbf{P}, \mathbf{A}, \mathcal{G}, \Delta)$ is a structured system description, then every $\mathbf{A}$-instantiation is
consistent with database $\Delta$. This also means that the database of a structured system
description is guaranteed to be consistent by construction. Note that this global consistency 
is guaranteed given only local conditions on component descriptions. 

We close this section by noting that a structured system description as defined here 
is the diagnosis special-case of a symbolic causal network which we introduced elsewhere 
(Darwiche & Pearl, 1994). 

4. Structure-Based Computation of Consequences 

A main message of the previous section is that composing a system consequence from 
component consequences is straightforward, as long as we are not concerned about the 
size of the resulting consequence. Specifically, given that component consequences are 
precomputed, one can compute a system consequence by successive applications of the 
Intersection Corollary. In practice, however, the size of the resulting consequence is a key 
concern because it affects the time needed to generate the consequence, the space needed 
to store it, and the time needed to extract from it the minimal diagnoses. 

Therefore, we shall present in this section a method for composing a system conse- 
quence from component consequences while trying to minimize the size of the resulting 
consequence. This method rests on partitioning component descriptions using a jointree: 
a tree of hypernodes that results from graphically transforming the system structure. In 
particular, 

- we present in Section 4.1 jointrees and show how they can be used to partition com- 
ponent descriptions; 

- we then present in Section 4.2 an algorithm for computing component consequences 
which are the building blocks of system consequences; and 

- finally, we present in Section 4.3 an algorithm for computing system consequences 
using component consequences and a jointree. 

4.1 Partitioning Component Descriptions using Jointrees 

A jointree $T$ is constructed for a given directed acyclic graph $\mathcal{G}$. The nodes of a jointree
are called clusters or cliques and they represent sets of nodes in the graph $\mathcal{G}$. Figure 5
contains a jointree for the system structure (directed acyclic graph) in Figure 3. The sepset
of any adjacent cliques $C_i$ and $C_j$ in a jointree is defined as their intersection $C_i \cap C_j$ and is
denoted by $S_{ij}$.

Figure 3: A structured system description of a digital circuit.

Figure 4: (a) a tree $T$; (b) a subtree $T_{53}$; (c) the partitioning of $T$ around clique $C_5$.

There are two conditions that must be satisfied by a jointree: 

1. The ports of each component must belong to some clique in the jointree. 

2. If a node belongs to two cliques, it must also belong to every clique on the path 
between them. This is called the jointree property. 
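The two conditions can be checked mechanically. The sketch below is a minimal illustration, not part of the original method: the function name `is_jointree`, the dictionary-based representation, and the example cliques are all assumptions, and the input edges are assumed to already form a tree. For a tree of cliques, the jointree property is equivalent to requiring that the cliques containing any given node form a connected subtree, which is what the breadth-first search tests.

```python
from collections import deque

def is_jointree(cliques, edges, families):
    """cliques: dict name -> set of nodes; edges: pairs of clique names (assumed
    to form a tree); families: list of sets, each holding the ports of a component."""
    # Condition 1: the ports of each component belong to some clique.
    if not all(any(f <= c for c in cliques.values()) for f in families):
        return False
    adj = {name: set() for name in cliques}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Condition 2: the cliques holding a node must be connected to each other
    # through cliques that also hold that node.
    for node in set().union(*cliques.values()):
        holders = {name for name, c in cliques.items() if node in c}
        start = next(iter(holders))
        seen, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for nxt in adj[current]:
                if nxt in holders and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if seen != holders:
            return False
    return True

# Hypothetical cliques and component families (not those of Figure 5).
cliques = {"C1": {"A", "B", "D"}, "C2": {"A", "C", "D"}, "C3": {"C", "D", "E"}}
families = [{"A"}, {"B"}, {"A", "B", "D"}, {"A", "C"}, {"C", "D", "E"}]
good = is_jointree(cliques, [("C1", "C2"), ("C2", "C3")], families)
bad = is_jointree(cliques, [("C1", "C3"), ("C3", "C2")], families)
```

In the second call, node A belongs to C1 and C2 but not to C3, which lies on the path between them, so the jointree property fails.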

We partition component descriptions using a jointree by assigning each component to
a clique that contains the ports of that component. We will use COMPONENTS_OF($C$) to
denote the components assigned to clique $C$ and refer to the function COMPONENTS_OF as a
component assignment for $(\mathcal{G}, T)$. Note that a component assignment is not unique. The literature on probabilistic reasoning contains heuristics for choosing a component assignment,
but they are outside the scope of this paper (Huang & Darwiche, 1996; Jensen, Lauritzen,
& Olesen, 1990).

To see how jointrees are used to generate consequences, we need the following notation.
Let $T$ be a jointree and let $T_{ij}$ denote the subtree that continues to include clique $C_i$ after
deleting the arc $C_i$-$C_j$ from $T$. For example, Figure 4(a) depicts a jointree $T$ and Figure 4(b)
depicts the subtree $T_{53}$.

Now, each clique $C_i$ in a jointree partitions the jointree into clique $C_i$ and a number of
subtrees $T_{ji}$ where $C_j$ is a neighbor of $C_i$. Moreover, by the jointree property, clique $C_i$ is
guaranteed to contain all atoms that are shared by any two subtrees $T_{ji}$. Therefore, each
clique $C_i$ can be viewed as partitioning components into those assigned to $C_i$ and those
assigned to subtrees $T_{ji}$. For example, one can partition the jointree in Figure 4(a) into
clique $C_5$ and the subtrees $T_{35}$, $T_{45}$, $T_{65}$ and $T_{75}$ as shown in Figure 4(c). Moreover, any atom
that is shared by two subtrees $T_{x5}$ and $T_{y5}$ must belong to clique $C_5$. Therefore, the atoms
of this clique make a very good candidate for being the set $S$ in the Intersection Corollary;
that is, the atoms on which to perform case analysis. If we apply the Intersection Corollary
with this choice of $S$, we will decompose a system consequence (one with respect to the
components of $T$) into a number of consequences, some of which are with respect to the
components assigned to clique $C_5$ and the others with respect to the components assigned
to subtrees $T_{35}$, $T_{45}$, $T_{65}$ and $T_{75}$. Each one of the resulting consequences is with respect
to a smaller system, and the decomposition process can continue recursively until we reach
boundary conditions, where each consequence is with respect to the components assigned to
a clique. Clique consequences can be computed easily from the consequences of components
assigned to them. More details on this will be given in Section 4.3 where we provide the
pseudocode for an algorithm based on this decomposition process.

We shall close this section by explaining why the partition induced by a jointree is better
than an arbitrary one. The answer is simple: the optimization criterion for constructing
jointrees attempts to minimize the size of cliques. It therefore attempts to minimize the
number of atoms on which to perform case analysis when applying the Intersection Corollary
and, in turn, the size of
the resulting consequence. For a detailed, self-contained discussion on the construction of
jointrees, the reader is referred to (Huang & Darwiche, 1996).

4.2 Computing Component Consequences 

In addition to constructing a jointree and a component assignment, we must compute component consequences before we can compute system consequences. In particular, for a given
component description $(\mathbf{I}, O, \Delta_O)$, we need to compute $Cons^{\Delta_O}(\gamma)$ for every instantiation
$\gamma$ of the atoms $\mathbf{I} \cup \{O\}$. That is, for every possible state $\gamma$ of the component ports, $\mathbf{I} \cup \{O\}$,
we must compute and store the strongest conclusion, $Cons^{\Delta_O}(\gamma)$, we can draw about the
health of that component.

This is accomplished by the algorithm in Figure 6, which is a direct implementation of
the Component-Consequence Theorem in Section 3.1. The algorithm assumes that each
clause $\alpha$ in the component description $\Delta_O$ is decomposed into two parts, one containing
only assumables, $\alpha_A$, and another containing only non-assumables, $\alpha_P$. It returns an array
whose indices correspond to instantiations of the atoms $\mathbf{I} \cup \{O\}$ and whose entries are
component consequences.

COMPONENTS_OF($C_1$) = {A, B, D}; COMPONENTS_OF($C_2$) = {C}; and COMPONENTS_OF($C_3$) = {E}.

Figure 5: On the left, a jointree for the structured system description of Figure 3. On the
right, the same jointree with sepsets shown on each arc. The table shows the
component descriptions, $\Delta_i$, that are assigned to clique $C_i$.

Please note that we use integers to represent instantiations, a technique that we detail 
in Appendix B. This appendix provides two key functions, one for computing a unique 
index for each instantiation and another for generating all instantiations of a given set of 
atoms. These functions are used in the pseudocode of Figure 6. 

The soundness of the algorithm in Figure 6 is given below: 

Theorem 7 The array CONSEQUENCES$_O$ computed in Figure 6 satisfies the following property:

CONSEQUENCES$_O[I] = Cons^{\Delta_O}(\gamma)$

for every instantiation $\gamma$ of $\mathbf{I} \cup \{O\}$ and its corresponding index $I$.

Figure 7 provides a detailed example showing how the algorithm of Figure 6 is used to
compute the consequences of a 2-input or-gate. The number of computed consequences in
this example is eight, which is $2^{2+1}$, since the component has two inputs and one output.
Each component consequence, $Cons^{\Delta_D}(\gamma)$, in this example is equivalent to either true or
$\neg okY$ depending on whether the instantiation $\gamma$ of atoms $\{A, B, D\}$ is consistent with the
expected behavior of the or-gate.

Computing Component Consequences

Input: A component description $(\mathbf{I}, O, \Delta_O)$. The database $\Delta_O$ consists of pairs $(\alpha_P, \alpha_A)$ where
$\alpha_P$ is a set of $\mathbf{P}$-literals and $\alpha_A$ is a set of $\mathbf{A}$-literals. The pair $(\alpha_P, \alpha_A)$ represents the disjunction
of literals in $\alpha_P \cup \alpha_A$.

Output: An array whose indices are $0, \ldots, 2^{|\mathbf{I}|+1} - 1$ and whose elements are nodes in an NNF-graph.

Pseudocode:

COMPONENT_CONSEQUENCE

01. for every $\gamma$ in GENERATE_INSTANTIATIONS($\mathbf{I} \cup \{O\}$) do
02.   ASSERT($\gamma$)
03.   $I \leftarrow$ INDEX($\mathbf{I} \cup \{O\}$)
04.   CONSEQUENCES$_O[I] \leftarrow$ NEW_AND_NODE()
05.   for every pair $(\alpha_P, \alpha_A)$ in $\Delta_O$ do
06.     when EMPTY?($\alpha_P \cap \gamma$) do {the clause $\alpha_P$ is inconsistent with the instantiation $\gamma$}
07.       ADD_CHILD(CONSEQUENCES$_O[I]$, CREATE_DISJUNCTION($\alpha_A$))
08.   RETRACT($\gamma$)
09. return CONSEQUENCES$_O$

CREATE_DISJUNCTION($L$)

01. $Disj \leftarrow$ NEW_OR_NODE()
02. for every literal $(V, v)$ in $L$ do
03.   ADD_CHILD($Disj$, NEW_LITERAL_NODE($V$, $v$))
04. return $Disj$

Figure 6: An algorithm for computing component consequences.
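The procedure of Figure 6 can be sketched in Python as follows. This is an illustrative sketch rather than the paper's implementation: the names `component_consequence` and `OR_GATE`, and the index convention (the first port is the least significant bit), are assumptions, and each consequence is kept as a set of clauses over assumables instead of as nodes of an NNF-graph. The clause pairs are those of the 2-input or-gate of Figure 7.

```python
from itertools import product

def component_consequence(ports, clauses):
    """ports: ordered port atoms (inputs, then output).
    clauses: list of pairs (alpha_p, alpha_a), each a set of (atom, value) literals.
    Returns a list indexed by port instantiation; each entry is a set of clauses
    over assumables, read as a conjunction (the empty set stands for true)."""
    table = [None] * (2 ** len(ports))
    for values in product((0, 1), repeat=len(ports)):
        gamma = set(zip(ports, values))
        # Assumed index convention: ports[0] is the least significant bit.
        index = sum(v << k for k, v in enumerate(values))
        conjunction = set()
        for alpha_p, alpha_a in clauses:
            # If gamma satisfies no P-literal of the clause, the A-part must hold.
            if not (alpha_p & gamma):
                conjunction.add(frozenset(alpha_a))
        table[index] = conjunction
    return table

# The or-gate with inputs A, B and output D, in the pair representation of Figure 7.
OR_GATE = [
    ({("A", 1), ("B", 1), ("D", 0)}, {("okY", 0)}),
    ({("A", 0), ("D", 1)}, {("okY", 0)}),
    ({("B", 0), ("D", 1)}, {("okY", 0)}),
]
table = component_consequence(["A", "B", "D"], OR_GATE)

TRUE = set()
NOT_OK_Y = {frozenset({("okY", 0)})}
```

With this convention, entries 1 to 4 of `table` equal `NOT_OK_Y` and the remaining entries equal `TRUE`, in line with the array reported in Figure 7.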

The following theorem shows that computing component consequences is exponential in 
the number of component ports, but is linear in the number of clauses in its description. 

Theorem 8 The time and space complexity of COMPONENT_CONSEQUENCE is $O(sn2^n)$,
where $s$ is the number of pairs in database $\Delta_O$ and $n$ is the number of atoms in $\mathbf{I} \cup \{O\}$.

Therefore, as long as the number of inputs to a component is small enough, computing 
component consequences can be considered to take constant time. Given this complexity 
result, one should attempt to minimize the number of inputs per component. Bear in mind 
that components are conceptual constructs and need not strictly correspond to physical 
components in the system. For example, one may opt to view an n-input and-gate as 
a number of cascaded 2-input and-gates. This will reduce the complexity of computing
component consequences from $O(sn2^n)$ to $O(s)$, a significant improvement.^11

4.3 Computing System Consequences 

Given that we have 

- constructed a jointree; 

- chosen a component assignment; 

- computed component consequences; 

we are now ready to compute system consequences. All we need is to apply the Intersection 
Corollary successively until we decompose the system consequence into a number of com- 
ponent consequences which can be simply looked-up from the arrays that are computed by 
the algorithm of Figure 6. 

To apply the Intersection Corollary, we need a pivot clique which is given as input 
to the algorithm in Figure 8. The algorithm has two main functions, one for computing a 
consequence with respect to the components assigned to a clique and another for computing 
a consequence with respect to the components assigned to a subtree. Let us consider the 
function for computing clique consequences first: 

Lemma 1 When CLIQUE_CONSEQUENCE($C_i$) is called, atoms $C_i$ are guaranteed to be instantiated to some $\alpha$ and the call returns a sentence equivalent to $Cons^{\Delta_i}(\phi_i \wedge \alpha)$ where

- $\Delta_i$ is the union of all component descriptions assigned to clique $C_i$; and

- $\phi_i$ is the projection of system observation $\phi$ on the atoms in clique $C_i$.

This function is simple; it directly applies the Decomposition Theorem to decompose a clique
consequence into the conjunction of component consequences (which are provided as input
to the algorithm).

The second function of the algorithm computes a consequence with respect to the com- 
ponents assigned to a subtree Tij: 

Lemma 2 When SUBTREE_CONSEQUENCE($C_i$, $C_j$) is called, atoms $S_{ij}$ are guaranteed to be
instantiated to some $\beta$ and the call returns a sentence equivalent to $Cons^{\Delta_{ij}}(\phi_{ij} \wedge \beta)$ where

- $\Delta_{ij}$ is the union of all component descriptions assigned to cliques in subtree $T_{ij}$; and

- $\phi_{ij}$ is the projection of system observation $\phi$ on the atoms in subtree $T_{ij}$.

This function is more involved since it calls CLIQUE_CONSEQUENCE and itself recursively.

Note that the results returned by this function are cached since the function may be
called more than once with the same arguments and the same instantiation of sepset $S_{ij}$.^12

11. In such a case, however, the technique that we discuss in Appendix C must be used to ensure that the 
descriptions of these 2-input components do not share assumables. 

12. This is because the same instantiation of a sepset Sij may appear as part of many instantiations of clique 
Ci. 
An Example of Algorithm COMPONENT_CONSEQUENCE

Consider the following component description, $(\{A, B\}, D, \Delta_D)$, of a 2-input or-gate where

$$\Delta_D = \{\; \neg A \wedge \neg B \wedge okY \supset \neg D,\quad A \wedge okY \supset D,\quad B \wedge okY \supset D \;\}.$$

Algorithm COMPONENT_CONSEQUENCE requires the following representation of $\Delta_D$:

$$\{\; (\{(A,1),(B,1),(D,0)\}, \{(okY,0)\}),\quad (\{(A,0),(D,1)\}, \{(okY,0)\}),\quad (\{(B,0),(D,1)\}, \{(okY,0)\}) \;\}$$

where each clause is represented as a pair of clauses, one containing only non-assumables and the other
containing only assumables. Calling COMPONENT_CONSEQUENCE on this component description returns
the following array (first two columns):

index   CONSEQUENCES$_D$   $\gamma$
0       true               $\neg A \wedge \neg B \wedge \neg D$
1       $\neg okY$         $A \wedge \neg B \wedge \neg D$
2       $\neg okY$         $\neg A \wedge B \wedge \neg D$
3       $\neg okY$         $A \wedge B \wedge \neg D$
4       $\neg okY$         $\neg A \wedge \neg B \wedge D$
5       true               $A \wedge \neg B \wedge D$
6       true               $\neg A \wedge B \wedge D$
7       true               $A \wedge B \wedge D$

Figure 7: Computing consequences of an or-gate with inputs $\{A, B\}$ and output $D$.
Computing System Consequences

Input:

- $(\mathbf{P}, \mathbf{A}, \mathcal{G}, \Delta)$: a structured system description;

- $\phi$: an instantiation of some atoms $\mathbf{E} \subseteq \mathbf{P}$;

- $T$: a jointree for system structure $\mathcal{G}$;

- COMPONENTS_OF: a component assignment for $(\mathcal{G}, T)$;

- CONSEQUENCES$_O$: a component consequence array for each component $O$, computed
using the algorithm in Figure 6;

- $C_z$: an arbitrary cluster in the tree $T$ (pivot cluster).

Output: An NNF-graph over assumables $\mathbf{A}$.

Data Structures: For each pair of neighboring clusters $(C_x, C_y)$ in the jointree $T$:

- a sepset $S_{xy} = C_x \cap C_y$; and

- a hash table HASH_TABLE$_{xy}$ which contains NNF-graph nodes.

The keys of HASH_TABLE$_{xy}$ are $0, \ldots, 2^{|S_{xy}|} - 1$. HASH_TABLE$_{xy}[I]$ is the element having $I$ as a
key.

Pseudocode:

SYSTEM_CONSEQUENCE

01. add cluster $C_0 = \emptyset$ as a neighbor of $C_z$
02. ASSERT($\phi$)
03. SUBTREE_CONSEQUENCE($C_z$, $C_0$)

SUBTREE_CONSEQUENCE($C_i$, $C_j$)

00. $I \leftarrow$ INDEX($S_{ij}$)
01. $\mu \leftarrow$ HASH_TABLE$_{ij}[I]$; if $\mu \neq$ NIL then return $\mu$
02. $Disj \leftarrow$ NEW_OR_NODE()
03. for every $\alpha$ in GENERATE_INSTANTIATIONS($C_i$) do
04.   ASSERT($\alpha$)
05.   $Conj \leftarrow$ NEW_AND_NODE()
06.   ADD_CHILD($Conj$, CLIQUE_CONSEQUENCE($C_i$))
07.   for every neighboring cluster $C_k$ of $C_i$ where $k \neq j$ do
08.     ADD_CHILD($Conj$, SUBTREE_CONSEQUENCE($C_k$, $C_i$))
09.   ADD_CHILD($Disj$, $Conj$)
10.   RETRACT($\alpha$)
11. HASH_TABLE$_{ij}[I] \leftarrow Disj$; return $Disj$

CLIQUE_CONSEQUENCE($C_i$)

01. $Conj \leftarrow$ NEW_AND_NODE()
02. for every component $O$ (with inputs $\mathbf{I}$) in COMPONENTS_OF($C_i$) do
03.   ADD_CHILD($Conj$, CONSEQUENCES$_O$[INDEX($\mathbf{I} \cup \{O\}$)])
04. return $Conj$

Figure 8: An algorithm for computing a system consequence given a system observation.
In such a case, the result is looked up from the cache instead of being recomputed. This
caching of results explains why the generated consequence has the form of a directed acyclic
graph instead of a tree.

Given the function SUBTREE_CONSEQUENCE, one can compute the system consequence
by adding a dummy clique $C_0 = \emptyset$ as a neighbor to the pivot clique $C_z$ to generate an extended
jointree $T'$. The subtree $T_{z0}$ of this extended tree $T'$ will be nothing but the original jointree
$T$. The call to SUBTREE_CONSEQUENCE on line 03 of SYSTEM_CONSEQUENCE will then
return the system consequence. This is stated more explicitly by the following theorem:

Theorem 9 The function SYSTEM_CONSEQUENCE in Figure 8 will always terminate, returning a sentence which is equivalent to $Cons^{\Delta}(\phi)$.
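The recursion of Figure 8 can be sketched in Python as follows. This is a hedged illustration, not the paper's implementation: NNF nodes are nested tuples, the global ASSERT/RETRACT state is replaced by a dictionary `fixed` that is passed down, the cache is keyed on the sepset instantiation, and the dummy clique is represented by `None`. The small system used to exercise it (an inverter X from A to C in clique 1, a buffer Y from A to D in clique 2) is an assumption, as are all names.

```python
from itertools import product

TRUE = ("and", ())

def system_consequence(cliques, neighbors, components_of, tables, ports, observation, pivot):
    cache = {}

    def extensions(atoms, fixed):
        """All instantiations of `atoms` that agree with the already fixed atoms."""
        free = [a for a in sorted(atoms) if a not in fixed]
        base = {a: fixed[a] for a in atoms if a in fixed}
        for values in product((0, 1), repeat=len(free)):
            yield {**base, **dict(zip(free, values))}

    def clique_consequence(i, inst):
        # Conjoin the precomputed consequences of the components assigned to clique i.
        return ("and", tuple(tables[o][tuple(inst[p] for p in ports[o])]
                             for o in components_of[i]))

    def subtree_consequence(i, j, fixed):
        sepset = cliques[i] & cliques.get(j, set())
        key = (i, j, tuple(sorted((a, fixed[a]) for a in sepset)))
        if key in cache:                         # same subtree, same sepset state
            return cache[key]
        disjuncts = []
        for inst in extensions(cliques[i], fixed):   # case analysis on clique atoms
            children = [clique_consequence(i, inst)]
            for k in neighbors[i]:
                if k != j:
                    children.append(subtree_consequence(k, i, {**fixed, **inst}))
            disjuncts.append(("and", tuple(children)))
        cache[key] = ("or", tuple(disjuncts))
        return cache[key]

    return subtree_consequence(pivot, None, dict(observation))

def evaluate(node, world):
    """Truth value of an NNF node under an instantiation of the assumables."""
    if node[0] == "lit":
        return world[node[1]] == node[2]
    results = [evaluate(child, world) for child in node[1]]
    return all(results) if node[0] == "and" else any(results)

# Assumed two-clique system: inverter X (A -> C) and buffer Y (A -> D).
cliques = {1: {"A", "C"}, 2: {"A", "D"}}
neighbors = {1: [2], 2: [1]}
components_of = {1: ["C"], 2: ["D"]}
ports = {"C": ("A", "C"), "D": ("A", "D")}
tables = {
    "C": {(a, c): TRUE if c != a else ("lit", "okX", 0) for a in (0, 1) for c in (0, 1)},
    "D": {(a, d): TRUE if d == a else ("lit", "okY", 0) for a in (0, 1) for d in (0, 1)},
}
nnf = system_consequence(cliques, neighbors, components_of, tables, ports,
                         {"C": 1, "D": 1}, pivot=1)
```

Evaluating `nnf` on every instantiation of okX and okY shows that, under these assumptions, it is equivalent to $\neg okX \vee \neg okY$. The sketch performs no simplification of the resulting graph.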

Figure 9 contains a detailed example showing how the algorithm in Figure 8 is used to 
generate a system consequence for a given observation and structured system description. 
Note that the computed consequence in this example is equivalent to $\neg okX \vee \neg okZ$.

The consequences computed by the algorithm in Figure 8 are guaranteed to be in de- 
composable negation normal form as long as component consequences are in that form. 

Theorem 10 In Figure 8, if each component consequence CONSEQUENCES$_O[\cdot]$ is in decomposable NNF, the sentence returned by the function SYSTEM_CONSEQUENCE will be in
decomposable NNF.

Note that the consequences computed by COMPONENT_CONSEQUENCE in Figure 6 are not
guaranteed to be in decomposable NNF. However, they are guaranteed to be in CNF. We
can simply transform them into DNF and, hence, into decomposable NNF.

5. The Complexity of Computing Consequences 

We now turn to the discussion of computational complexity: 

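Theorem 11 Consider the algorithm SYSTEM_CONSEQUENCE in Figure 8. The time and
space complexity of this algorithm is

$$O\Big(\sum_{C \in T} |C| \, 2^{|C \setminus \mathbf{E}|}\Big),$$

where $C$ denotes a clique in the jointree $T$ and $\mathbf{E}$ is the set of atoms that appear in the observation $\phi$.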

The complexity is then linear in the number of cliques, but exponential in their sizes. 

Consider all jointrees of a given graph $\mathcal{G}$, and select the tree $T$ with the smallest maximal
clique. The size of the maximal clique in such a tree minus one is known as the width $w^*$
of graph $\mathcal{G}$ (Dechter, 1992; Dechter & Pearl, 1989). Theorem 11 is then saying that the
computational complexity of a consequence is exponential in the width $w^*$ of the given
system structure $\mathcal{G}$. But this complexity result is for the worst case; the best and average
cases are different.
An Example of Algorithm SYSTEM_CONSEQUENCE

Consider the jointree and component assignment given in Figure 5. Consider also the
component consequence arrays CONSEQUENCES$_O$ of the assigned components.

[Tables: component consequence arrays; contents not recoverable from the source.]

Calling SYSTEM_CONSEQUENCE with the observation $\phi$ as $\{(A, 1), (E, 1)\}$ and the pivot cluster
as $C_1$, we get a consequence (in decomposable negation normal form) that is equivalent to
$\neg okX \vee \neg okZ$.

[Diagram: NNF graph of and-nodes, numbered or-nodes, and leaves true and $(okY, 0)$; structure not recoverable from the source.]

The resulting consequence can be simplified significantly and easily. One way to realize this
simplification is to use an optimized version of ADD_CHILD in Appendix A which performs the
necessary simplifications as it adds a child to a node.

Figure 9: An example showing the application of algorithm SYSTEM_CONSEQUENCE. We
have numbered or-nodes in the consequence to facilitate the depiction of structure
sharing without having to draw cross arcs.
5.1 The Effect of Observations

According to Theorem 11, the time and space complexity are exponential not in the size of
cliques, but in the size of cliques after we have removed from them atoms $\mathbf{E}$ that appear
in the observation $\phi$. At one extreme, the observation $\phi$ is true ($\mathbf{E} = \emptyset$), in which case
the algorithm is exponential in the size of the maximal clique. At another extreme, the
observation is an instantiation of all non-assumables ($\mathbf{E} = \mathbf{P}$). In such a case, the
algorithm does not depend on the size of cliques and is only linear in the number of such
cliques. This can be seen by observing line 02 of algorithm SYSTEM_CONSEQUENCE where
all atoms in the system structure are instantiated. This means that any call
to GENERATE_INSTANTIATIONS will return a singleton. Hence, every or-node constructed
on line 02 of SUBTREE_CONSEQUENCE will have exactly one child. Moreover, the number
of calls to SUBTREE_CONSEQUENCE will be exactly the number of edges in the jointree and
the number of calls to CLIQUE_CONSEQUENCE will be exactly the number of cliques in the
jointree.
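The dependence on observations can be made concrete with a small calculation. The sketch below evaluates the quantity $\sum_{C} |C| \, 2^{|C \setminus \mathbf{E}|}$, which is how this section reads the bound of Theorem 11; the function name and the example cliques are hypothetical and do not correspond to Figure 5.

```python
def case_analysis_cost(cliques, observed):
    """Sum over cliques C of |C| * 2**|C minus E|, where E is the set of observed atoms."""
    return sum(len(c) * 2 ** len(c - observed) for c in cliques)

# Hypothetical jointree cliques over atoms A..E.
cliques = [frozenset("ABD"), frozenset("ACD"), frozenset("CDE")]

unobserved = case_analysis_cost(cliques, frozenset())         # E is empty
partly = case_analysis_cost(cliques, frozenset("AE"))         # A and E observed
fully = case_analysis_cost(cliques, frozenset("ABCDE"))       # all atoms observed
```

For these cliques the three values are 72, 36 and 9: each additional observed atom removes a factor of two from every clique that contains it, and with all atoms observed only the sum of the clique sizes remains.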

The effect of observations on the computational complexity of consequences is illustrated 
in Figure 10 which depicts four consequences with respect to the system in Figure 3. The 
consequences correspond to the following observations: 

- $\phi_1 = A \wedge E$;

- $\phi_2 = A \wedge B \wedge E$;

- $\phi_3 = A \wedge B \wedge C \wedge E$; and

- $\phi_4 = A \wedge B \wedge C \wedge \neg D \wedge E$.

Clearly, observation $\phi_i$ is stronger than observation $\phi_{i-1}$. Interestingly enough, the consequence of $\phi_i$ is a subset of the consequence of $\phi_{i-1}$. Moreover, the subset is obtained by
eliminating some children of or-nodes, which corresponds to considering fewer cases when
applying Case-Analysis.

Because of this dependence on observations, the presented algorithm for generating 
consequences will do very well in applications where the system is strongly observed. We 
have experienced this ourselves when we applied this framework to discrete event systems 
which are typically strongly observed (Darwiche & Provan, 1996, 1997). Even when the 
system is not strongly observed, Theorem 11 is useful in deciding which observations would
be most rewarding computationally. 

A more dramatic effect of the system observation on complexity is obtained by cutting 
outgoing arcs of observed nodes from the system structure. This technique may actually 
reduce the w* of the given system structure and is illustrated in Appendix E, together with 
some experimental results that show its effectiveness. 

5.2 Tree System Structures 

An important special case of the presented algorithm is when the system structure $\mathcal{G}$ that
we start with is already a tree, that is, only one undirected path exists between any two
nodes in $\mathcal{G}$ (see Figure 11). In this case, one can always construct a jointree in which (a)
each clique contains only a node and its parents in $\mathcal{G}$; and (b) each sepset contains only one
node. In particular, for each node $N$ in the system structure $\mathcal{G}$, we construct a clique which
contains the family of $N$ (that is, $\{N\} \cup \mathcal{G}_N$). We then connect the clique corresponding to
$N$ with the cliques corresponding to its neighbors, eliminating any cliques that are contained
in their neighbors. The result is guaranteed to be a jointree. Figure 11 contains a jointree
of a tree system structure. More details on this construction can be found in (Shachter,
Andersen, & Szolovits, 1994).

13. This observation is the basis of an approach for compiling devices into parameterized NNFs, which is
described in (Darwiche, 1998a).

Figure 10: Four consequences corresponding to four different observations.

Figure 11: On the left, a system structure with no cycles. On the right, a corresponding
jointree.
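The family-clique construction described above for tree system structures can be sketched as follows. This is a minimal sketch under assumptions: the function name `tree_jointree`, the `parents` dictionary encoding, and the example structure are hypothetical, and the input is assumed to be singly connected (one undirected path between any two nodes). Cliques are named after the node whose family they hold.

```python
def tree_jointree(parents):
    """parents: dict mapping every node to the list of its parents.
    Returns (cliques, edges): family cliques keyed by node, and clique-tree edges."""
    cliques = {n: frozenset([n, *ps]) for n, ps in parents.items()}
    # Connect the clique of each node to the cliques of its neighbours in the graph.
    adj = {n: set(ps) for n, ps in parents.items()}
    for n, ps in parents.items():
        for p in ps:
            adj[p].add(n)
    # Repeatedly eliminate a clique that is contained in one of its neighbours,
    # reattaching its remaining neighbours to the absorbing clique.
    absorbed = True
    while absorbed:
        absorbed = False
        for small in list(cliques):
            big = next((b for b in adj[small] if cliques[small] <= cliques[b]), None)
            if big is None:
                continue
            for other in adj[small] - {big}:
                adj[other].discard(small)
                adj[other].add(big)
                adj[big].add(other)
            adj[big].discard(small)
            del cliques[small], adj[small]
            absorbed = True
            break
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    return cliques, edges

# Hypothetical tree structure: A and B feed C; C feeds D and E.
parents = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"], "E": ["C"]}
cliques, edges = tree_jointree(parents)
sepsets = [cliques[u] & cliques[v] for u, v in map(tuple, edges)]
```

For this structure the single-node cliques of the roots A and B are absorbed into the family of C, leaving three cliques whose sepsets each contain one node.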

Therefore, when the system structure is a tree, the time and space complexity of com- 
puting a consequence are exponential only in the size of a family. Because of this result, one 
should attempt to construct tree system structures whenever possible. Moreover, special 
care must be exercised to avoid large families in such structures. For example, one should 
model n-input gates as a cascaded set of 2-input gates whenever possible. If a tree system 
structure is not possible, one should attempt to engineer the structure so as to minimize 
the size of cliques in its corresponding jointrees. 

Although tree system structures are computationally easy, non-tree structures are not 
necessarily hard (El Fattah & Dechter, 1995). In fact, it is possible to obtain a linear 
complexity even if the system structure is not a tree. Consider, for example, $n$-bit adders
which are constructed from cascading full adders. These adders have non-tree structures,
but it is shown in (El Fattah & Dechter, 1995) that they have jointrees which grow linearly
in the number of bits n. Therefore, the consequences corresponding to these systems grow 
linearly in the number of bits n. 

6. Extracting Minimal Diagnoses 

In this section, we present a method for extracting the minimal diagnoses characterized 
by a consequence. In particular, if the consequence is in decomposable NNF, and if the 
minimality criterion satisfies some general conditions, the algorithm we shall present will 
extract minimal diagnoses in time linear in the size of the consequence. 

We start first by stating the required conditions on a minimality criterion and then 
introduce the algorithm with some examples. We finally state the algorithm formally and prove
its soundness and its linear computational complexity.
6.1 Cost Functions

Intuitively, minimal diagnoses are those diagnoses which are considered most plausible. One 
may have many instantiations of assumables that are consistent with the system description 
and observation, but only a few of these may be considered most plausible. A common 
minimality criterion is the one based on diagnosis cardinality. That is, we simply count the
number of $\neg ok\cdot$ literals in a diagnosis (the cardinality of the diagnosis) and consider the diagnoses
with minimal cardinality to be the most plausible.

We adopt the view that each diagnosis has a cost, and minimal diagnoses are those with 
a minimal cost. Moreover, we assume that the cost of a diagnosis is obtained by adding 
up the costs of its literals. This is all captured by the following notion of a cost function, 
which assigns costs to literals and instantiations. 

Definition 12 A cost function is a tuple $(\mathbf{A}, \Sigma, \oplus, \mathcal{F})$ where

• $\mathbf{A}$ is a set of atoms;

• $\Sigma$ is a set of costs;

• $\oplus$ is a binary operation (addition) on $\Sigma$ which satisfies the following properties:

1. $\oplus$ is commutative, associative, and has a zero element $0$.

2. For all $a$ and $b$ in $\Sigma$, either $a \oplus c = b$ or $b \oplus c = a$ for some unique $c$.

3. $a \oplus b = 0$ implies $a = 0$ and $b = 0$.

• $\mathcal{F}$ maps each $\mathbf{A}$-literal into a cost in $\Sigma$, where each literal or its negation has cost $0$.
The function is extended to instantiations as follows:

$$\mathcal{F}(l_1 \wedge \ldots \wedge l_n) = \mathcal{F}(l_1) \oplus \ldots \oplus \mathcal{F}(l_n).$$

The conditions we impose on the cost addition function are justified as follows. The first 
condition is to be expected of addition operations. The second condition ensures that costs 
are totally ordered; that is, for any two costs, one can be obtained from the other by adding 
some cost to it. The third condition ensures that there are no negative costs so that we 
cannot reduce the cost of an instantiation by adding more literals. 
A cost function induces an ordering on costs as follows: 

Definition 13 The ordering relation $\leq_{\oplus}$ induced by a cost function $(\mathbf{A}, \Sigma, \oplus, \mathcal{F})$ is defined
as follows: $a \leq_{\oplus} b$ iff $a \oplus c = b$ for some $c$. If $a \leq_{\oplus} b$ and $a \neq b$, we write $a <_{\oplus} b$.

That is, if $b$ is obtained by adding some cost to $a$, then $b$ is greater than or equal to
$a$. Moreover, this induced ordering is guaranteed to be a total ordering by virtue of the
properties we imposed on cost addition:

Theorem 12 The relation $\leq_{\oplus}$ induced by a cost function $(\mathbf{A}, \Sigma, \oplus, \mathcal{F})$ is a total ordering
on $\Sigma$.
An example of a cost function satisfying the above conditions is $(\{0, 1, 2, \ldots\}, +, \kappa)$,
where the cost of a literal $l$, $\kappa(l)$, is the order-of-magnitude of its probability.^14

Another example of a cost function is the minimum cardinality criterion, which is widely
used in the diagnosis literature. The criterion is specified by $(\{0, 1, 2, \ldots\}, +, Card)$, where
$Card(l) = 1$ if the literal $l$ designates a fault and $Card(l) = 0$ otherwise.

Note that the triple $([0, 1], *, Pr)$, where the cost of a literal is its probability, is not a
legitimate cost function according to the conditions stated above. The cost addition function
$*$ has 1 as its zero element in this case. Therefore, the condition $Pr(l) = 1$ or $Pr(\neg l) = 1$
does not necessarily hold for each literal $l$.^15 Note also that the set-inclusion version of
diagnosis minimality is not covered by our definition of cost functions; this notion, however,
is problematic when fault models are included (de Kleer et al., 1992).
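The minimum cardinality criterion can be written down directly. The following Python sketch is illustrative only: literals are encoded as `(atom, value)` pairs with `(atom, False)` taken to be the fault literal, which is an assumption, and the checks of the addition properties are run over a small sample of costs rather than proved.

```python
from itertools import product

def card(literal):
    """Cost of a literal under minimum cardinality: 1 for a fault, 0 otherwise."""
    _atom, value = literal
    return 0 if value else 1      # assumption: (atom, False) encodes the fault literal

def cost(instantiation):
    """Extend the literal cost to an instantiation by adding up literal costs."""
    return sum(card(literal) for literal in instantiation)

def leq(a, b):
    """Induced order on {0, 1, 2, ...}: a <= b iff a + c = b for some cost c."""
    return b - a >= 0

SAMPLE = range(6)
# Property 2 on the sample: any two costs are comparable.
totally_ordered = all(leq(a, b) or leq(b, a) for a, b in product(SAMPLE, repeat=2))
# Property 3 on the sample: a + b = 0 only when a = 0 and b = 0.
no_negative_costs = all(a + b != 0 or (a == 0 and b == 0)
                        for a, b in product(SAMPLE, repeat=2))
# Each literal or its negation has cost 0.
zero_side = all(card((atom, True)) == 0 or card((atom, False)) == 0
                for atom in ("okX", "okY", "okZ"))

example = cost([("okX", True), ("okY", False), ("okZ", False)])
```

Here `example` counts the two fault literals of the instantiation, so a diagnosis with fewer faults is smaller under `leq`.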

The reason we require each literal or its negation to have cost $0$ is to be able to characterize minimal diagnoses using minimal partial diagnoses, a property which is not possible
in general. For example, let $\alpha = \alpha_1 \vee \ldots \vee \alpha_n$ be a disjunction of partial diagnoses and
let $\alpha'$ be the result of removing from $\alpha$ all partial diagnoses that do not have minimal
cost. The above requirement guarantees that $\alpha$ and $\alpha'$ characterize the same set of minimal
diagnoses.

6.2 The Extraction Algorithm 

The goal of the algorithm that we shall present next is to compute diagnoses that are
characterized by a consequence τ and that have a minimal cost compared to other diagnoses
characterized by τ.

Definition 14 (Minimal Instantiations) Let (A, S, ⊕, ℱ) be a cost function and let τ
be an A-sentence. A minimal instantiation of τ with respect to (A, S, ⊕, ℱ) is an
A-instantiation α such that

1. α ⊨ τ; and

2. if β is an A-instantiation such that β ⊨ τ, then ℱ(α) ≤⊕ ℱ(β).

The minimal instantiations of τ are denoted by MinInst(τ).

Consider the consequence τ in Figure 9 for an example. This consequence is equivalent
to ¬okX ∨ ¬okZ and, therefore, characterizes six diagnoses:

14. The order-of-magnitude of a probability p, written κ(p), is an integer such that p/ε^{κ(p)} is finite but not
infinitesimal for an infinitesimal ε (Darwiche & Goldszmidt, 1994; Goldszmidt & Pearl, 1992; Spohn,
1987). That is, ε^{κ(p)+1} < p ≤ ε^{κ(p)}. Note that if a and b are probabilistically independent, then
κ(Pr(ab)) = κ(Pr(a)) + κ(Pr(b)).

15. Consider the following counterexample with respect to the triple ([0, 1], *, Pr) which does not satisfy the
above requirement. Suppose that the assumables are okX, okY, okZ and okW and their probabilities
are .9, .9, .7 and .9, respectively. Suppose further that α = okX ∧ okY ∧ ¬okW ∨ ¬okY ∧ okZ. The first
partial diagnosis is less probable than the second and, therefore, pruning gives α' = ¬okY ∧ okZ. Note,
however, that α characterizes six diagnoses, two of which, okX ∧ okY ∧ okZ ∧ ¬okW and okX ∧ ¬okY ∧
okZ ∧ okW, are most probable. On the other hand, α' characterizes four diagnoses, only one of which,
okX ∧ ¬okY ∧ okZ ∧ okW, is most probable. Therefore, α and α' do not characterize the same set of
most probable diagnoses.

1. okX ∧ okY ∧ ¬okZ

2. okX ∧ ¬okY ∧ ¬okZ

3. ¬okX ∧ okY ∧ okZ

4. ¬okX ∧ okY ∧ ¬okZ

5. ¬okX ∧ ¬okY ∧ okZ

6. ¬okX ∧ ¬okY ∧ ¬okZ.

According to the minimum-cardinality cost function where A = {okX, okY, okZ}, the costs
of these diagnoses are 1, 2, 1, 2, 2, 3, respectively. Therefore, the consequence characterizes
only two minimal diagnoses:

1. okX ∧ okY ∧ ¬okZ
3. ¬okX ∧ okY ∧ okZ.

Hence,

MinInst(τ) = {okX ∧ okY ∧ ¬okZ, ¬okX ∧ okY ∧ okZ}.

The purpose of our algorithm is to compute this set of diagnoses given consequence τ.
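
As a sanity check on this example, the minimal instantiations can be obtained by brute-force enumeration. The sketch below is not the extraction algorithm of this section; it simply enumerates all instantiations of {okX, okY, okZ}, keeps those satisfying ¬okX ∨ ¬okZ, and selects the ones of minimum cardinality. The function names are illustrative.

```python
from itertools import product

ATOMS = ["okX", "okY", "okZ"]


def consequence(m):
    """The sentence (not okX) or (not okZ), evaluated in the instantiation m."""
    return (not m["okX"]) or (not m["okZ"])


def cardinality(m):
    """Minimum-cardinality cost: the number of assumables set to false."""
    return sum(1 for atom in ATOMS if not m[atom])


def brute_force_min_inst():
    """Return (all diagnoses, minimal diagnoses) by exhaustive enumeration."""
    models = [dict(zip(ATOMS, values))
              for values in product([True, False], repeat=len(ATOMS))]
    diagnoses = [m for m in models if consequence(m)]
    best = min(cardinality(m) for m in diagnoses)
    return diagnoses, [m for m in diagnoses if cardinality(m) == best]


diagnoses, minimal = brute_force_min_inst()
```

This enumeration is exponential in the number of assumables, which is exactly what the extraction algorithm avoids by working on the NNF directly.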

As it turns out, extracting minimal diagnoses from a decomposable negation normal 
form is quite simple and can be carried out in two steps. In the first step, the consequence 
is pruned to result in a sentence that characterizes the minimal diagnoses only. In the 
second step, all instantiations of the pruned consequence are generated, each of which will 
then be a minimal diagnosis. The algorithm for extracting minimal diagnoses is given in 
Figure 12 and is explained in more detail below: 

Phase I: Cost Propagation and Pruning 

Each node in the NNF is assigned a cost as follows: 

1. The cost of a literal is given by the cost function; 

2. The cost of a conjunction is the ©-sum of costs assigned to its conjuncts; 

3. The cost of a disjunction is the minimum of costs assigned to its disjuncts. 

It is shown by Lemma 7 in Appendix F that the cost assigned to node μ is nothing but the
cost of the minimal instantiations it characterizes.

In addition to this cost propagation, some pruning takes place in this phase. In particular,
if a disjunction μ has a disjunct ν with a higher cost than μ, then the disjunct ν is removed
from μ.¹⁶ The intuition here is that ν will be contributing partial diagnoses that are not
minimal when compared to the partial diagnoses contributed by its siblings. Moreover, all
completions of these partial diagnoses will be more costly than some completion of a partial
diagnosis contributed by a sibling.
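
Phase I can be sketched as a single memoized traversal of the NNF-graph. The `Node` class, the `card` cost function and the small example graph below are assumptions made for illustration (the example is not the consequence of Figure 9); costs are integers combined with ordinary addition.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Node:
    kind: str                                  # "lit", "and" or "or"
    literal: Optional[Tuple[str, int]] = None  # (atom, value) for literal nodes
    children: List["Node"] = field(default_factory=list)
    cost: Optional[float] = None               # filled in by prune


def card(literal):
    """Minimum-cardinality cost: a negated ok-atom counts as one fault."""
    return 1 if literal[1] == 0 else 0


def prune(node, literal_cost=card):
    """Assign a cost to every node and drop non-minimal disjuncts."""
    if node.cost is None:                      # each node is visited once
        child_costs = [prune(c, literal_cost) for c in node.children]
        if node.kind == "lit":
            node.cost = literal_cost(node.literal)
        elif node.kind == "and":
            node.cost = sum(child_costs)       # an empty and-node (true) costs 0
        else:
            # min over an empty or-node (false) is infinity
            node.cost = min(child_costs, default=math.inf)
            node.children = [c for c in node.children if c.cost == node.cost]
    return node.cost


# Hypothetical example: (not okX and not okZ) or (not okZ).
not_x = Node("lit", ("okX", 0))
not_z = Node("lit", ("okZ", 0))
both = Node("and", children=[not_x, not_z])
root = Node("or", children=[both, not_z])
prune(root)
```

After the call, the conjunction has cost 2 and is pruned from the root, which keeps only the disjunct ¬okZ of cost 1.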



16. This is achieved by deleting the link between μ and ν in the NNF-graph.

Computing Minimal Instantiations

Input:

- (A, S, ⊕, ℱ): a cost function;

- τ: a decomposable NNF over assumables A;

- Atoms(μ, ν): atoms that appear in μ but not in ν, for each or-node μ and its child ν.

Output: A set of A-instantiations.

Data Structures: For each node μ in τ, two fields COST(μ) and TERMS(μ) initialized to NIL.

Pseudocode:

MINIMAL_INSTANTIATIONS(τ)
01. PRUNE(τ)
02. INSTANTIATIONS(τ)

PRUNE(τ)
01. when COST(τ) = NIL do
02.    for each μ ∈ CHILDREN(τ) do PRUNE(μ)
03.    case LITERAL_NODE?(τ): COST(τ) ← ℱ(LITERAL_OF(τ))
04.         AND_NODE?(τ): COST(τ) ← ⊕ {COST(μ) : μ ∈ CHILDREN(τ)}
05.         OR_NODE?(τ): COST(τ) ← min {COST(μ) : μ ∈ CHILDREN(τ)} (a)
                         CHILDREN(τ) ← {μ : μ ∈ CHILDREN(τ) and COST(μ) = COST(τ)}

INSTANTIATIONS(τ)
01. when TERMS(τ) = NIL do
02.    if COST(τ) = ∞, TERMS(τ) ← {}; return
03.    for each μ ∈ CHILDREN(τ) do INSTANTIATIONS(μ)
04.    case LITERAL_NODE?(τ): TERMS(τ) ← {{LITERAL_OF(τ)}}
05.         AND_NODE?(τ): TERMS(τ) ← ⊓ {TERMS(μ) : μ ∈ CHILDREN(τ)} (b)
06.         OR_NODE?(τ): TERMS(τ) ← ⋃ {EXTEND(TERMS(μ), Atoms(τ, μ)) : μ ∈ CHILDREN(τ)}

EXTEND(T, S)
01. if S = {}
02.    then return T
03.    else V ← HEAD(S)
04.         Z ← {{(V, v)} : ℱ((V, v)) = 0}
05.         EXTEND(T ⊓ Z, REST(S))

a. min is defined with respect to the total order ≤⊕. When applied to an empty set, min returns the
special symbol ∞ which is bigger than any cost in S.

b. {α1, α2, ..., αn} ⊓ {β1, β2, ..., βm} = ⋃ {αi ∪ βj : i = 1, ..., n and j = 1, ..., m}.

Figure 12: An algorithm for extracting minimal diagnoses from a decomposable NNF.

Phase II: Computing Instantiations

The result of Phase I is a negation normal form that characterizes minimal diagnoses only.

To extract the minimal diagnoses, all we need then is to compute the instantiations of this
pruned negation normal form, that is, compute its sum-of-product representation. This can
be done recursively as follows:

1. The instantiations of a literal μ are {μ}.

2. The instantiations of a conjunction are the Cartesian product of the instantiations of
its conjuncts.¹⁷

3. The instantiations of a disjunction are the union of the extended instantiations of its
disjuncts. To understand this extension process, consider a disjunction μ and one of
its disjuncts ν. Just before the instantiations of ν are unioned with the instantiations
of its siblings, they are extended by adding zero-cost literals to them. The goal of this
extension process is to make sure that each extended instantiation of ν mentions all
atoms that appear in μ. This step would not be necessary if we were computing the
DNF of μ. Note, however, that we are computing the sum-of-product representation of
μ instead, which means that each product in the sum must mention all atoms that
appear in μ.

The soundness of the algorithm is given below: 

Theorem 13 After the termination of MINIMAL_INSTANTIATIONS in Figure 12, TERMS(μ) =
MinInst(μ) for every node μ in the NNF-graph τ.

A detailed example of the algorithm is given in Figure 13 where cost propagation, 
pruning, and the computation of instantiations are all explicated. 
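
For readers who prefer executable code, the two phases of Figure 12 can be sketched together in Python. This is a minimal sketch under assumptions of ours: the `Node` class, integer costs under ordinary addition, atoms with values {0, 1}, and a hypothetical decomposable NNF equivalent to (¬okX ∨ ¬okZ) ∧ (okY ∨ ¬okY), which is not the NNF of Figure 9. Atoms(μ, ν) is derived from per-node atom sets computed before pruning.

```python
import math


class Node:
    def __init__(self, kind, literal=None, children=()):
        self.kind = kind              # "lit", "and" or "or"
        self.literal = literal        # (atom, value) for literal nodes
        self.children = list(children)
        self.cost = self.terms = self.atoms = None


def card(literal):
    return 1 if literal[1] == 0 else 0


def atoms(node):
    """Atoms mentioned below node; computed before any pruning."""
    if node.atoms is None:
        if node.kind == "lit":
            node.atoms = frozenset([node.literal[0]])
        else:
            node.atoms = frozenset().union(*(atoms(c) for c in node.children))
    return node.atoms


def prune(node, cost_of):
    if node.cost is None:
        costs = [prune(c, cost_of) for c in node.children]
        if node.kind == "lit":
            node.cost = cost_of(node.literal)
        elif node.kind == "and":
            node.cost = sum(costs)
        else:
            node.cost = min(costs, default=math.inf)
            node.children = [c for c in node.children if c.cost == node.cost]
    return node.cost


def extend(terms, missing, cost_of):
    """Add a zero-cost literal for every atom the terms do not mention."""
    for atom in sorted(missing):
        zero = [frozenset([(atom, v)]) for v in (0, 1) if cost_of((atom, v)) == 0]
        terms = {t | z for t in terms for z in zero}
    return terms


def instantiations(node, cost_of):
    if node.terms is None:
        if node.cost == math.inf:
            node.terms = set()
        elif node.kind == "lit":
            node.terms = {frozenset([node.literal])}
        elif node.kind == "and":      # product of the children's terms
            node.terms = {frozenset()}
            for c in node.children:
                node.terms = {t | u for t in node.terms
                              for u in instantiations(c, cost_of)}
        else:                         # union of the extended children's terms
            node.terms = set()
            for c in node.children:
                node.terms |= extend(instantiations(c, cost_of),
                                     node.atoms - c.atoms, cost_of)
    return node.terms


def minimal_instantiations(root, cost_of=card):
    atoms(root)
    prune(root, cost_of)
    return instantiations(root, cost_of)


def lit(atom, value):
    return Node("lit", (atom, value))


tau = Node("and", children=[
    Node("or", children=[lit("okX", 0),
                         Node("and", children=[lit("okX", 1), lit("okZ", 0)])]),
    Node("or", children=[lit("okY", 1), lit("okY", 0)]),
])
result = minimal_instantiations(tau)
```

On this input the sketch returns the two minimum-cardinality instantiations ¬okX ∧ okY ∧ okZ and okX ∧ okY ∧ ¬okZ. Note that the literal ¬okX must be extended with the zero-cost literal okZ before it is unioned with its sibling.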

The computational complexity of the presented algorithm is given next: 

Theorem 14 Consider the algorithm MINIMAL_INSTANTIATIONS in Figure 12. The time
complexity of this algorithm is O(cE) where E is the number of edges in the NNF-graph
τ and c is the square of the number of instantiations returned by the algorithm (that is,
c = |MinInst(τ)|²).

We have two points to stress about the above result. 

First, if the number of minimal diagnoses is small enough, the factor c is a constant,
and the time and space complexity of MINIMAL_INSTANTIATIONS becomes O(E), which is
linear in the size of τ. However, if the number of minimal diagnoses cannot be regarded as
a constant, then one cannot claim linear complexity. Needless to say, one can
never do better than the size of the answer that one is trying to compute. For example, if the
number of minimal diagnoses is exponential in the size of τ, then no extraction algorithm
can do better than exponential since the answer it has to return is exponential.

Second, the size of the consequence generated by SYSTEM_CONSEQUENCE of Figure 8
is decided by the jointree T and the set of observed atoms E; it is independent of the

17. This step will not be sound unless the NNF is decomposable, that is, no atoms are shared between 
conjuncts. 

An Example of Computing Minimal Instantiations

Consider the consequence in Figure 9 and the minimal cardinality cost function. Calling PRUNE
on the consequence leads to the following pruned NNF:

[Figure: pruned NNF-graph annotated with node costs]

The cost of each node is shown next to it. The dotted links represent pruned parent-child links.
Calling INSTANTIATIONS on this result leads to the following:

[Figure: pruned NNF-graph annotated with Atoms(μ, ν) and TERMS(μ)]

Atoms(μ, ν) is shown on the arc between μ and ν. TERMS(μ) is shown next to node μ. The final
result is two minimal instantiations ¬okX ∧ okY ∧ okZ and okX ∧ okY ∧ ¬okZ.

Figure 13: An example illustrating the computation of minimal instantiations.
specific instantiation φ of E. However, the number of minimal diagnoses extracted by
MINIMAL_INSTANTIATIONS in Figure 12 is very much dependent on the specific observation
φ. Therefore, if two observations φ1 and φ2 refer to the same set of atoms E, the cost
of generating their consequences will be the same. However, the cost of extracting their
minimal diagnoses may be different.

For a concrete example of this last point, consider an n-bit adder and the following
observations:

- φ1: All input/output bits are low except for the first sum bit.

- φ2: All input/output bits are low except for all sum bits.

The corresponding consequences of these observations will be both linear in n. However,
the first observation leads to two minimal diagnoses (the first full adder has a broken
xor-gate), while the second observation leads to 2^n minimal diagnoses (each of the full
adders has a broken xor-gate). Therefore, computing the minimal diagnoses of φ1 is linear
in n, but computing the minimal diagnoses of φ2 is exponential in n.

The last few observations reveal a merit of splitting the computation of minimal diag- 
noses into two steps: characterization of all diagnoses and then extraction of minimal ones. 
This is contrary to common practice in structure-based reasoning, but it allows one to make 
more refined statements on the complexity of model-based diagnosis. 

7. Conclusion and Related Work 

We have presented a comprehensive approach for characterizing and computing minimal di- 
agnoses given a structured system description. What is most important about our approach 
is that it ties the computational complexity of diagnostic reasoning to a very meaningful 
aspect of systems: the topology of their system structures. Thus, it provides diagnostic 
practitioners with more flexibility in engineering the response time of their applications. 
This emphasis on structure has been the central theme in probabilistic reasoning lately and
there are a number of other algorithms for importing this theme into model-based diagnosis
(Dechter & Dechter, 1996; Geffner & Pearl, 1987). There are some key differences, however,
between our proposal and the previous ones:

- Given consequences and the theorems to manipulate them, our proposal can be viewed
as a framework for structure-based diagnosis instead of simply an algorithm. We did
propose a specific algorithm which utilizes jointrees, but our framework can
accommodate other structure-based algorithms (that do not necessarily utilize jointrees) as
long as they provide a mechanism for applying the Intersection Corollary efficiently
(see (Darwiche, 1998b) for an example).

- We have decomposed the computation of minimal diagnoses into two independent 
stages: characterization of diagnoses using consequences and then choosing a minimal 
subset of them. This separation has at least three advantages. First, it allowed us to 
offer guarantees on the size of a consequence which are independent of the number of 

18. Component consequences will depend on φ, but not the number of and/or nodes which are added by
SYSTEM_CONSEQUENCE.
(minimal) diagnoses that it characterizes. Next, it allowed us to incorporate different
minimality criteria without having to alter the characterization algorithm. Finally, it 
inspired a device compilation technique which is discussed in (Darwiche, 1998a). 

- The time and space complexity of our algorithm depends not only on the system 
topology but also on the system observables. In particular, we have shown that the 
more observables we have, the easier it is to diagnose a system using our algorithm. 
We have also provided a refined complexity result which explicates precisely the effect 
of system observables on the complexity of structure-based diagnosis. 

- Finally, previous algorithms have rested on the language of constraints among multi- 
valued variables. Our approach uses a purely logical setting, which allows computation 
directly on Boolean syntax. This bridges the gap even further between structure-based 
reasoning and model-based diagnosis. 

Appendix A. Representing Negation Normal Forms

This appendix describes our representation of negation normal forms using directed acyclic 
graphs. It also provides a number of operations on the suggested representation and some 
associated conventions. The material in this appendix is necessary to understand the pseu- 
docode provided throughout the paper. 

Representation: 

An NNF-graph over atoms 𝐕 is a rooted, directed acyclic graph with three types of
nodes: literal-nodes, and-nodes and or-nodes. For each literal-node μ, LITERAL_OF(μ) =
(V, v) where V ∈ 𝐕 and v ∈ {0, 1}. Each node μ in an NNF-graph represents a Boolean
sentence as follows:

- If μ is a literal-node and LITERAL_OF(μ) = (V, v), then μ represents

  - the positive literal V when v = 1

  - the negative literal ¬V when v = 0.

- If μ is an and-node and the i-th child of μ represents sentence αi, then μ represents
the conjunction ∧i αi.

- If μ is an or-node and the i-th child of μ represents sentence αi, then μ represents the
disjunction ∨i αi.

The sentence represented by an NNF-graph is the sentence represented by its root node. 
Operations: 

We have the following operations to manipulate NNF-graphs: 

- LITERAL_NODE?(μ), AND_NODE?(μ), OR_NODE?(μ): type predicates for nodes.

- NEW_LITERAL_NODE(V, v): creates and returns a new literal-node labeled with (V, v).

- NEW_AND_NODE(): creates and returns a new and-node (with no children).

- NEW_OR_NODE(): creates and returns a new or-node (with no children).

- ADD_CHILD(ν, μ): adds node μ to the children of node ν.

- CHILDREN(ν): returns the children of node ν.

Conventions: 

We adopt the following conventions with respect to NNF-graphs: 

- If τ is an NNF-graph, then τ will also be used to denote the root of τ.

- We use true to denote an and-node with no children and false to denote an or-node
with no children.

- We do not show the directions of arcs, assuming that they point downwards.

- The NNF-graph rooted at some node is formed from that node and all its descendants.

- A node and the sentence it represents will be used interchangeably.

Example:

Consider the following NNF-graph:

[Figure: an NNF-graph rooted at the or-node n8, with literal-nodes (okX, 0) and (okZ, 0) and the constant nodes false and true]

This graph represents the following sentence: (¬okX ∧ (¬okZ ∨ false)) ∨ ((¬okZ ∨ false) ∧
true). It can be created using the following sequence of calls:

n1 ← NEW_LITERAL_NODE(okZ, 0)
n2 ← NEW_OR_NODE()
n3 ← NEW_OR_NODE()
ADD_CHILD(n3, n1)
ADD_CHILD(n3, n2)
n4 ← NEW_LITERAL_NODE(okX, 0)
n5 ← NEW_AND_NODE()
ADD_CHILD(n5, n3)
ADD_CHILD(n5, n4)
n6 ← NEW_AND_NODE()
n7 ← NEW_AND_NODE()
ADD_CHILD(n7, n3)
ADD_CHILD(n7, n6)
n8 ← NEW_OR_NODE()
ADD_CHILD(n8, n5)
ADD_CHILD(n8, n7).
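
The operations above map directly onto a small Python class. The sketch below is one possible rendering, not the representation used in our implementation: the `Node` class and the `holds` evaluator are illustrative additions, and the sequence of calls mirrors the example graph.

```python
class Node:
    def __init__(self, kind, literal=None):
        self.kind = kind          # "lit", "and" or "or"
        self.literal = literal    # (atom, value) for literal nodes
        self.children = []


def new_literal_node(atom, value):
    return Node("lit", (atom, value))


def new_and_node():
    return Node("and")


def new_or_node():
    return Node("or")


def add_child(parent, child):
    parent.children.append(child)


def holds(node, model):
    """Evaluate the sentence represented by node in a model (atom -> 0/1)."""
    if node.kind == "lit":
        atom, value = node.literal
        return model[atom] == value
    results = [holds(c, model) for c in node.children]
    # all([]) is True (the node true); any([]) is False (the node false)
    return all(results) if node.kind == "and" else any(results)


n1 = new_literal_node("okZ", 0)
n2 = new_or_node()                 # false
n3 = new_or_node()
add_child(n3, n1)
add_child(n3, n2)
n4 = new_literal_node("okX", 0)
n5 = new_and_node()
add_child(n5, n3)
add_child(n5, n4)
n6 = new_and_node()                # true
n7 = new_and_node()
add_child(n7, n3)
add_child(n7, n6)
n8 = new_or_node()
add_child(n8, n5)
add_child(n8, n7)
```

Node n3 is shared by n5 and n7, which is why the structure is a graph rather than a tree. Evaluating n8 shows that the example sentence is equivalent to ¬okZ.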

Appendix B. Representing Literals and Instantiations

This appendix discusses our representation of literals and instantiations and provides some
operations on these representations. The material in this appendix is necessary to
understand the pseudocode which is provided throughout the paper.

Representation: 

An instantiation is represented by a set of pairs (V, v) where V is an atom and v ∈
VALUES_OF(V) = {0, 1}.

Operations:

An instantiation L can be asserted or retracted using ASSERT(L) and RETRACT(L).
After asserting an instantiation L, we have the following for every pair (V, v) in L:

- INSTANTIATED?(V) is true;

- VALUE_OF(V) equals v.

We assume that sets of atoms are ordered, say alphabetically, where HEAD(𝐕) returns the
first element of the set and REST(𝐕) returns the tail of the set. Two operations are defined
on ordered sets of atoms:

- A function to compute a unique index for the instantiated values of set 𝐕:¹⁹

INDEX(𝐕) {atoms in 𝐕 must be instantiated}
01. if 𝐕 = {}
02.    then return 0
03.    else return VALUE_OF(HEAD(𝐕)) + 2 * INDEX(REST(𝐕))

- A function to compute all instantiations of the uninstantiated atoms in 𝐕:

GENERATE_INSTANTIATIONS(𝐕) {some atoms in 𝐕 may be instantiated}
01. if 𝐕 = {}
02.    then return {{}}
03.    else V ← HEAD(𝐕)
04.         R ← GENERATE_INSTANTIATIONS(REST(𝐕))
05.         if INSTANTIATED?(V) then return R
06.         else return ⋃ {{(V, v)} ∪ α : α ∈ R}, where the union ranges over v ∈ VALUES_OF(V)

Examples:

Consider the instantiation α = A ∧ ¬B ∧ D. This is represented as L = {(A, 1), (B, 0), (D, 1)}.
Suppose now that we call ASSERT(L). Then INSTANTIATED?(A) is true and VALUE_OF(B) =
0. Moreover, INDEX({A, B, D}) = 5 and we have GENERATE_INSTANTIATIONS({C, D, E}) =
{{(C, 1), (E, 1)}, {(C, 0), (E, 1)}, {(C, 1), (E, 0)}, {(C, 0), (E, 0)}}.
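
The two functions can be tried out with the following Python sketch, which reproduces the example above. The module-level dictionary `assigned` is an assumption that stands in for the state maintained by ASSERT and RETRACT, and ordered sets of atoms are modeled as lists.

```python
assigned = {}  # atom -> value; stands in for the ASSERT/RETRACT state


def assert_instantiation(instantiation):
    assigned.update(dict(instantiation))


def retract_instantiation(instantiation):
    for atom, _value in instantiation:
        assigned.pop(atom, None)


def index(atoms):
    """Unique index of the asserted values of an ordered list of atoms."""
    if not atoms:
        return 0
    return assigned[atoms[0]] + 2 * index(atoms[1:])


def generate_instantiations(atoms):
    """All instantiations of the atoms in the list that are not asserted."""
    if not atoms:
        return [frozenset()]
    head, rest = atoms[0], generate_instantiations(atoms[1:])
    if head in assigned:
        return rest
    return [frozenset({(head, v)}) | alpha for v in (1, 0) for alpha in rest]


L = [("A", 1), ("B", 0), ("D", 1)]
assert_instantiation(L)
idx = index(["A", "B", "D"])
insts = generate_instantiations(["C", "D", "E"])
```

The head of the list is the least significant bit of the index, so A = 1, B = 0, D = 1 yields 1 + 2 * (0 + 2 * 1) = 5.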



19. The ordering of V is essential for this function to return the same index for each instantiation of V. 

Appendix C. Avoiding the Sharing of Assumables in Component Descriptions

Consider the following figure which contains a structured system description where
component descriptions Δ_C and Δ_D share the assumable Pwr:

[Figure: a structured system description in which the component descriptions Δ_C and Δ_D both mention the assumable Pwr]

We can avoid the sharing of assumable Pwr by introducing an auxiliary node Pwr' and
making it equivalent to the shared assumable Pwr as shown in the figure below. As a result
of adding this auxiliary node Pwr', we no longer have a common assumable between the
component descriptions Δ_C and Δ_D (but we have a common non-assumable Pwr' instead).






[Figure: the structured system description after adding the auxiliary node Pwr', whose component description is {Pwr ⊃ Pwr', ¬Pwr ⊃ ¬Pwr'}]

This simple technique can always be used to ensure that no sharing of assumables takes
place between component descriptions. Note, however, that this assurance comes at a price:
the structured system description with auxiliary nodes (no assumable sharing) will typically
be topologically more complex than the one where sharing is allowed. This is clearly the case
in the above example, where the no-sharing-of-assumables requirement led us to transform
the system structure from a tree to a graph.

Figure 14: Modeling a component with multiple fault modes and multiple input values.

Appendix D. Extending the Framework to Multivalued Variables 

We show in this appendix how to represent constraints among assumables using two meth- 
ods: 



1. introducing auxiliary atoms into the system structure; 

2. using multivalued variables instead of atoms. 

The two approaches will be illustrated using the following example. 

Consider a component which has one input and one output: the input is either low, mid
or high and the output is either on or off. Furthermore, suppose that this component has
two fault modes M1 and M2. Such a component can be represented as shown in Figure 14.
In this figure, we have two auxiliary atoms, C1 and C2, which represent constraints. The
first constraint, C1, ensures that low, mid and high are mutually exclusive and exhaustive.
The second constraint, C2, ensures that M1 and M2 are mutually exclusive. The atoms
C1 and C2 should always be observed to enforce these constraints. That is, for any system
observation φ, we must actually invoke SYSTEM_CONSEQUENCE on φ ∧ C1 ∧ C2 to make sure
that the two constraints are enforced.

This approach is relatively reasonable but the representation is not as concise as one 
would expect. A better solution for representing such systems involves the use of multivalued 
variables instead of atoms. This solution will be described next and it leads to a more 
concise, efficient representation. 

{}

{ ((I, low) ∨ (I, mid)) ∧ (Mode, ok) ⊃ (O, off)
  (I, high) ∧ (Mode, ok) ⊃ (O, on)
  (Mode, M1) ⊃ (O, off)
  (Mode, M2) ⊃ (O, on) }

Figure 15: An SSD using multivalued variables.

An atom is a special case of a variable in that it only has two values. A general variable, 
however, can have any finite number of values. In the previous example, it would be best 
to represent the input as a variable with three values low, mid and high; to represent the 
output as a variable with two values on and off; and to represent the mode as a variable 
with three values ok, M1 and M2.

Using variables instead of atoms, the system in Figure 14 can be encoded as given in 
Figure 15. This is clearly a more compact and efficient representation. 

More generally though, we extend our framework to variables by taking the set of non- 
assumables P and the set of assumables A to be sets of variables, instead of atoms. We 
next discuss formally the syntax and semantics of the resulting variable-based logic, and 
then show what modifications are needed to our presented definitions and pseudocode (only 
two changes are needed). 

Syntax and Semantics In this variable-based framework, a literal is a pair (V, v), where
V is a variable and v is a value. A sentence is either a literal, the negation of a sentence, or
the combination of two sentences using a standard logical connective such as ∧, ∨ or ⊃. This
defines the syntax of a propositional logic with multivalued variables. This logic includes
standard propositional logic as a special case (that is, when all variables are binary). The
semantics of this logic is defined in the usual way. That is, a model is a function that
assigns one value to each variable. Moreover, a model ω satisfies a sentence under the
following conditions:

ω ⊨ (V, v) if ω assigns the value v to variable V;

ω ⊨ ¬α if ω ⊭ α;

ω ⊨ α ∨ β if ω ⊨ α or ω ⊨ β; and

ω ⊨ α ∧ β if ω ⊨ α and ω ⊨ β.

A sentence is satisfiable (consistent) iff it has at least one model. This also leads to the
usual notions of entailment and validity.

It is important to stress that in this variable-based logic, the term ¬(V, v) is not
considered a literal in general. Note that if V is not a binary variable, then ¬(V, v) would
be equivalent to a clause (V, v1) ∨ (V, v2) ∨ ... ∨ (V, vn) where v1, ..., vn are the values of
variable V not equal to v. Having said that, an instantiation of a set of variables 𝐕 is
defined as a conjunction of literals, one literal for each variable in 𝐕. This means that
(I, low) ∨ (O, on) ∨ (M, ok) is not an instantiation according to the previous definition.
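
The semantics above is easy to prototype. The following Python sketch is an illustration only: sentences are encoded as nested tuples, the variable domains are taken from the running example, and the names `sat`, `models` and `equivalent` are ours. It checks the claim that ¬(V, v) is equivalent to the clause over the remaining values of V.

```python
from itertools import product

# Domains of the running example: input I, output O and mode M.
DOMAINS = {
    "I": ["low", "mid", "high"],
    "O": ["on", "off"],
    "M": ["ok", "M1", "M2"],
}


def sat(model, sentence):
    """Does the model (variable -> value) satisfy the sentence?"""
    op = sentence[0]
    if op == "lit":
        _, variable, value = sentence
        return model[variable] == value
    if op == "not":
        return not sat(model, sentence[1])
    if op == "or":
        return any(sat(model, s) for s in sentence[1:])
    if op == "and":
        return all(sat(model, s) for s in sentence[1:])
    raise ValueError("unknown connective: %r" % (op,))


def models(domains):
    """Enumerate all models: one value for each variable."""
    names = sorted(domains)
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))


def equivalent(a, b, domains=DOMAINS):
    return all(sat(m, a) == sat(m, b) for m in models(domains))


negated = ("not", ("lit", "I", "low"))
clause = ("or", ("lit", "I", "mid"), ("lit", "I", "high"))
```

With these definitions, `equivalent(negated, clause)` holds, whereas the single literal (I, mid) is strictly stronger than ¬(I, low).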

Modifications In the extended framework, all definitions and theorems remain intact 
with two exceptions: 

1. Since variables can have more than two values, we need to generalize the function 
INDEX in Appendix B which computes unique indices for instantiations: 

INDEX(𝐕)

01. (i, b) ← AUX_INDEX(𝐕)

02. return i

AUX_INDEX(𝐕)

01. if 𝐕 = {}

02.    then return (0, 1)

03.    else (i, b) ← AUX_INDEX(REST(𝐕))

04.         V ← HEAD(𝐕)

05.         return (i + VALUE_OF(V) * b, |VALUES_OF(V)| * b)

Here, we are assuming that the values of each variable are {0, 1, ...}. Suppose, for
example, that we have 𝐕 = {V1, V2, V3} where V1 and V3 have the values {0, 1}
and V2 has the values {0, 1, 2}. We then have the following instantiations and their
corresponding indices:

V1   V2   V3   index
0    0    0    0
0    0    1    1
0    1    0    2
0    1    1    3
0    2    0    4
0    2    1    5
1    0    0    6
1    0    1    7
1    1    0    8
1    1    1    9
1    2    0    10
1    2    1    11

2. Since assumables can have more than two values, Definition 12 of a cost function
should be changed so that the condition:

- ℱ maps each A-literal into a cost in S, where each literal or its negation has
cost 0;

reads as

- ℱ maps each A-literal into a cost in S, where for each variable V ∈ A, at least
one literal (V, v) must have cost 0.
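
The generalized INDEX function of the first modification amounts to a mixed-radix encoding, which the following Python sketch makes explicit. The dictionaries `values` and `domain_sizes` are assumptions standing in for VALUE_OF and |VALUES_OF|, and the ordered set of variables is modeled as a list.

```python
from itertools import product


def aux_index(variables, values, domain_sizes):
    """Return (index, base) for an ordered list of instantiated variables."""
    if not variables:
        return (0, 1)
    i, b = aux_index(variables[1:], values, domain_sizes)
    head = variables[0]
    return (i + values[head] * b, domain_sizes[head] * b)


def index(variables, values, domain_sizes):
    return aux_index(variables, values, domain_sizes)[0]


VARIABLES = ["V1", "V2", "V3"]
SIZES = {"V1": 2, "V2": 3, "V3": 2}

# Enumerate instantiations with V1 varying slowest and V3 fastest.
indices = [
    index(VARIABLES, dict(zip(VARIABLES, combo)), SIZES)
    for combo in product(range(2), range(3), range(2))
]
```

The last variable in the ordering is the least significant digit, so the twelve instantiations of the example receive the indices 0 through 11 in order.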

Figure 16: A structured system description of a digital circuit. The arc from node A to
node D has been cut out given that node A is observed.



Appendix E. Cutting Outgoing Arcs of Observed Nodes

Consider the structured system description given in Figure 3 and suppose that we want
to compute the diagnoses for system observation φ = ¬A ∧ ¬E. This computation can be
performed with respect to the SSD in Figure 3 or, equivalently, it can be performed with
respect to the SSD in Figure 16. This SSD is obtained from the one in Figure 3 as follows:

1. The arc going from node A to node D is cut out.

2. Every occurrence of atom A in the component description Δ_D is replaced with false
since A appears negated in φ.²⁰

It is easy to show that the consequence of φ is the same with respect to both SSDs. Working
with the modified SSD is preferred, however, since it has a tree system structure.

We have the following observations about this process of cutting out arcs from the 
system structure: 

• One can cut out all outgoing arcs of any node that appears in the system observation.
In our previous example, we can also cut out the arc from A to C but we must also
modify Δ_C accordingly.²¹

• Even if cutting out arcs does not lead to a tree structure, it would typically lead to 
reducing the size of cliques of the resulting system structure. In fact, the reduction 
could move some problems from being practically unsolvable using structure-based 
methods to being solvable. Consider the results reported in (El Fattah & Dechter, 

20. A would be replaced by true if it did not appear negated. 

21. This will lead to a disconnected system structure. In such a case, we have to compute the consequence 
of each disconnected piece and then conjoin all consequences. 
1996) for an example, which shows jointree statistics for the benchmark circuits
proposed in (Brglez & Fujiwara, 1985). The following table shows the maximal-clique
sizes reported for a few of the circuits together with the maximal-clique sizes after the
arcs outgoing from observed nodes (root nodes) are cut out:²²



Circuit   #nodes   maximal-clique size    maximal-clique size
                   before cutting arcs    after cutting arcs
c432      196      28                     22
c499      243      25                     10
c880      443      28                     10
c1355     587      25                     10

There is clearly a dramatic change in the maximal-clique size for the last three circuits.
In fact, there are only a few cliques which have the maximal size or a size close to
it as is shown in (El Fattah & Dechter, 1996). This makes the approach reported
in this paper appropriate for the last three circuits. The approach, however, is not
appropriate as is for the first circuit.

We close this section by stressing again that this process of cutting out arcs can lead
to a very dramatic reduction in maximal-clique size. Therefore, it should be exploited
whenever possible (Darwiche & Provan, 1997).



22. The jointrees after cutting out arcs are computed using the algorithm reported in (Huang & Darwiche,
1996).

Appendix F. Proofs

Proof of Theorem 1

Suppose that Cons^Δ(φ) = α1 ∨ ... ∨ αn where each αi is an A-instantiation. Then Δ ∪ {φ} ⊨
α1 ∨ ... ∨ αn by definition of a consequence. We need to prove two things:

1. Each αi is a diagnosis.

It suffices to show that Δ ∪ {φ, αi} is consistent. Suppose that Δ ∪ {φ, αi} is not
consistent. Then Δ ∪ {φ} ⊨ ¬αi and

Δ ∪ {φ} ⊨ α1 ∨ ... ∨ α(i-1) ∨ α(i+1) ∨ ... ∨ αn.

Taking β = α1 ∨ ... ∨ αn, we contradict Condition 3 in the definition of a consequence.

2. If β is a diagnosis, then β must be equal to some αj.

Suppose that β is a diagnosis. Then Δ ∪ {φ, β} must be consistent. Moreover, Δ ∪
{φ, β} ⊨ α1 ∨ ... ∨ αn since Δ ∪ {φ} ⊨ α1 ∨ ... ∨ αn. This means that β must be equal
to some αj; otherwise, Δ ∪ {φ, β} would be inconsistent, which we know it is not. □

Proof of Theorem 2

Follows directly from the definition of a consequence and the characterization theorem. □

Proof of Theorem 3

First, observe that each clause α in Δ_O can be written as the disjunction of two clauses
α_P ∨ α_A where α_P is the projection of α on non-assumables P and α_A is the projection
of α on assumables A. Moreover, φ mentions every atom that appears in α_P. Therefore,
either

1. φ ⊨ α_P: Hence, φ ⊨ α_P ∨ α_A and φ ∧ (α_P ∨ α_A) is equivalent to φ; or

2. φ ⊨ ¬α_P: Hence, φ ∧ (α_P ∨ α_A) is equivalent to φ ∧ α_A.

Therefore, Δ_O ∪ {φ} is equivalent to φ conjoined with every α_A whose matching α_P is
inconsistent with φ. Therefore, the consequence of φ with respect to Δ_O is the conjunction
of all α_A's whose matching α_P is inconsistent with φ. □

Proof of Theorem 4

We prove this theorem in two steps:

1. Cons^{Θ∪Γ}(φ) = Cons^Θ(φ) ∧ Cons^Γ(φ).

Recall that Cons^{Θ∪Γ}(φ) is the strongest A-sentence entailed by Θ ∪ Γ ∪ {φ}. By
Lemma 3:

- There is a database Θ' that does not share atoms with φ, and Θ' ∪ {φ} ≡ Θ ∪ {φ}.

- There is a database Γ' that does not share atoms with φ, and Γ' ∪ {φ} ≡ Γ ∪ {φ}.

Note that no clause in Θ' ∪ {φ} can resolve with any clause in Γ' ∪ {φ} since there are no
common atoms between Θ' and Γ'. Therefore, the A-clauses derived from Θ' ∪ Γ' ∪ {φ}
are those derived from Θ' ∪ {φ} in addition to those derived from Γ' ∪ {φ}.²³ It then
follows that the strongest A-sentence entailed by Θ' ∪ Γ' ∪ {φ} (which is equivalent
to the conjunction of all A-clauses entailed by Θ' ∪ Γ' ∪ {φ}) is equivalent to the
strongest A-sentence entailed by Θ' ∪ {φ} conjoined with the strongest A-sentence
entailed by Γ' ∪ {φ}:

Cons^{Θ'∪Γ'}(φ) = Cons^{Θ'}(φ) ∧ Cons^{Γ'}(φ),

which leads to:

Cons^{Θ∪Γ}(φ) = Cons^Θ(φ) ∧ Cons^Γ(φ).

2. Cons^Θ(φ) = Cons^Θ(φ_Θ) and Cons^Γ(φ) = Cons^Γ(φ_Γ).

The atoms that appear in φ but do not appear in φ_Θ do not appear in Θ either.
Therefore, they do not affect the strongest A-sentence entailed by Θ ∪ {φ}. The same
is true for Γ and φ_Γ.

Therefore,

Cons^{Θ∪Γ}(φ) = Cons^Θ(φ_Θ) ∧ Cons^Γ(φ_Γ). □

Lemma 3 Suppose that $\Delta$ is a database and $\phi$ is a set of literals. There exists a database 
$\Delta'$ such that 

1. $\Delta'$ does not share atoms with $\phi$; and 

2. $\Delta' \cup \phi$ is equivalent to $\Delta \cup \phi$. 

Proof of Lemma 3 

We show how to construct $\Delta'$ from $\Delta$. Suppose that $\Delta$ is in clausal form. Each clause $\alpha$ in 
$\Delta$ must satisfy one of the following conditions: 

1. $\alpha$ does not share atoms with $\phi$; 

2. $\alpha$ shares some atoms with $\phi$ and we have either: 

(a) $\alpha$ and $\phi$ share a literal, which means that $\alpha$ is subsumed by $\phi$; or 

(b) $\alpha$ and $\phi$ share no literals, which means that $\alpha$ resolves with some literals in $\phi$ to 
yield a clause $\beta$ that does not share atoms with $\phi$. Moreover, $\phi \cup \{\beta\}$ is equivalent 
to $\phi \cup \{\alpha\}$. 

We can obtain $\Delta'$ from $\Delta$ as follows. For each clause $\alpha$ in $\Delta$: 

1. if $\alpha$ is in Class 1 above, add $\alpha$ to $\Delta'$; 

2. if $\alpha$ is in Class 2a above, ignore $\alpha$; 

3. if $\alpha$ is in Class 2b above, add the resolvent $\beta$ to $\Delta'$. 

It should be obvious that $\Delta' \cup \phi$ is equivalent to $\Delta \cup \phi$. □ 
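
The construction used in this proof can be sketched in a few lines of code. The following is only an illustration and is not taken from the paper: the function name `condition` and the encoding of literals as signed integers (k for an atom, -k for its negation) are assumptions made for the example.

```python
def condition(clauses, phi):
    """Build a database that shares no atoms with phi but is equivalent
    to the original database once phi is added back.

    clauses: iterable of clauses, each an iterable of nonzero ints
             (k is atom k, -k its negation); an assumed encoding.
    phi:     iterable of literals in the same encoding (a consistent set).
    """
    phi = set(phi)
    phi_atoms = {abs(lit) for lit in phi}
    result = []
    for clause in clauses:
        clause = frozenset(clause)
        if not any(abs(lit) in phi_atoms for lit in clause):
            # Class 1: no shared atoms, keep the clause unchanged.
            result.append(clause)
        elif clause & phi:
            # Class 2a: the clause shares a literal with phi, so it is subsumed.
            continue
        else:
            # Class 2b: every shared atom occurs with the opposite sign, so
            # resolving against phi simply removes those literals.
            result.append(frozenset(l for l in clause if abs(l) not in phi_atoms))
    return result


database = [[1, 2], [-1, 3], [4, 5]]
print(condition(database, [1]))
```

If the result contains the empty clause, the original database is inconsistent with the given literals.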
23. An A-clause is a clause which contains A-literals only. 
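Proof of Theorem 5 

It suffices to prove that $\mathit{Cons}^{\Delta}(\gamma \lor \beta) = \mathit{Cons}^{\Delta}(\gamma) \lor \mathit{Cons}^{\Delta}(\beta)$. The theorem then follows 
from observing that $\phi \equiv \bigvee_{\alpha} (\phi \land \alpha)$ where $\alpha$ ranges over instantiations of atoms $S$ that are 
consistent with $\phi$. This leads to $\mathit{Cons}^{\Delta}(\phi) = \mathit{Cons}^{\Delta}(\bigvee_{\alpha} (\phi \land \alpha)) = \bigvee_{\alpha} \mathit{Cons}^{\Delta}(\phi \land \alpha)$. 

1. $\mathit{Cons}^{\Delta}(\gamma) \lor \mathit{Cons}^{\Delta}(\beta) \models \mathit{Cons}^{\Delta}(\gamma \lor \beta)$. 

By definition of a consequence, we have $\mathit{Cons}^{\Delta}(\gamma) \models \mathit{Cons}^{\Delta}(\gamma \lor \beta)$ and $\mathit{Cons}^{\Delta}(\beta) \models 
\mathit{Cons}^{\Delta}(\gamma \lor \beta)$. Therefore, we must have $\mathit{Cons}^{\Delta}(\gamma) \lor \mathit{Cons}^{\Delta}(\beta) \models \mathit{Cons}^{\Delta}(\gamma \lor \beta)$. 

2. $\mathit{Cons}^{\Delta}(\gamma \lor \beta) \models \mathit{Cons}^{\Delta}(\gamma) \lor \mathit{Cons}^{\Delta}(\beta)$. 

By definition of a consequence, we have $\Delta \cup \{\gamma\} \models \mathit{Cons}^{\Delta}(\gamma)$ and $\Delta \cup \{\beta\} \models \mathit{Cons}^{\Delta}(\beta)$. 
Therefore, $(\Delta \cup \{\gamma\}) \lor (\Delta \cup \{\beta\}) \models \mathit{Cons}^{\Delta}(\gamma) \lor \mathit{Cons}^{\Delta}(\beta)$. This is equivalent to 
$\Delta \cup \{\gamma \lor \beta\} \models \mathit{Cons}^{\Delta}(\gamma) \lor \mathit{Cons}^{\Delta}(\beta)$, which means that $\mathit{Cons}^{\Delta}(\gamma) \lor \mathit{Cons}^{\Delta}(\beta)$ is an 
$A$-sentence implied by $\Delta \cup \{\gamma \lor \beta\}$. By definition of $\mathit{Cons}^{\Delta}(\gamma \lor \beta)$, we must then have 
$\mathit{Cons}^{\Delta}(\gamma \lor \beta) \models \mathit{Cons}^{\Delta}(\gamma) \lor \mathit{Cons}^{\Delta}(\beta)$. □ 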
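
The distribution of consequences over disjunction can be checked by brute force on a small example. The sketch below is illustrative only. It represents a consequence semantically, by the set of assumable instantiations consistent with the database and the observation (the models of the strongest sentence over the assumables), so that disjunction corresponds to set union. The helper name `consequence` and the single-inverter database are assumptions made for the example.

```python
from itertools import product


def consequence(delta, phi, atoms, assumables):
    """Models of the strongest sentence over `assumables` entailed by delta
    and phi, as a set of tuples of truth values (one entry per assumable)."""
    models = set()
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if delta(world) and phi(world):
            models.add(tuple(world[a] for a in assumables))
    return models


atoms = ["A", "C", "okX"]
assumables = ["okX"]


def delta(w):
    # Inverter X: (A and okX implies not C) and (not A and okX implies C).
    return (not w["okX"]) or (w["C"] != w["A"])


def gamma(w):
    return w["A"] and w["C"]


def beta(w):
    return w["A"] and not w["C"]


def gamma_or_beta(w):
    return gamma(w) or beta(w)


lhs = consequence(delta, gamma_or_beta, atoms, assumables)
rhs = consequence(delta, gamma, atoms, assumables) | consequence(delta, beta, atoms, assumables)
print(lhs == rhs)
```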

Proof of Theorem 6 

Suppose that $(P, A, \mathcal{G}, \Delta)$ is a structured system description according to Definition 11. We 
want to prove that $\Delta \cup \{\alpha\}$ is consistent for any instantiation $\alpha$ of the assumables $A$. 
The proof is by induction on the system structure. 

Base case: Suppose that the system structure has a single node $O$. Then $\Delta = \Delta_O$ and 
it follows from the definition of $\Delta_O$ that $\Delta \cup \{\alpha\}$ is consistent for any instantiation $\alpha$ of the 
assumables $A$. 

Inductive step: Suppose that we have a structured system description $(P, A, \mathcal{G}, \Delta)$ 
satisfying the above property. It suffices to show that the property will still hold after 
we add a leaf node $O$ to the system structure together with its component description 
$(I, O, \Delta_O)$. That is, we need to show that $\Delta \cup \Delta_O \cup \{\alpha\}$ is consistent for any instantiation 
$\alpha$ of the assumables $A$ given that $\Delta \cup \{\alpha\}$ is consistent. 

It suffices to show that $\Delta \cup \Delta_O \cup \{\alpha, \beta\}$ is consistent for some instantiation $\beta$ of $I \cup \{O\}$. 
Since $\Delta \cup \{\alpha\}$ is consistent, there must exist some instantiation $\beta_I$ of $I$ such that $\Delta \cup \{\alpha, \beta_I\}$ 
is consistent. Moreover, since $\Delta_O \cup \{\alpha, \beta_I\}$ is consistent (by definition of a component 
description), there must exist some instantiation $\beta_O$ of $O$ such that $\Delta_O \cup \{\alpha, \beta_I, \beta_O\}$ is 
consistent. Consider now the following: 

- $\Delta_O \cup \{\alpha, \beta_I, \beta_O\}$ is equivalent to $\{\alpha, \beta_I, \beta_O\}$ since every atom that appears in $\Delta_O$ 
appears in the instantiation $\{\alpha, \beta_I, \beta_O\}$ and, hence, $\{\alpha, \beta_I, \beta_O\} \models \Delta_O$. 

- $\Delta \cup \{\alpha, \beta_I, \beta_O\}$ is consistent since $\Delta \cup \{\alpha, \beta_I\}$ is consistent and the atom $O$ does not 
appear in $\Delta \cup \{\alpha, \beta_I\}$. 

Therefore, $\Delta \cup \Delta_O \cup \{\alpha, \beta_I, \beta_O\} \equiv \Delta \cup \{\alpha, \beta_I, \beta_O\}$ is consistent and, hence, $\Delta \cup \Delta_O \cup \{\alpha\}$ 
is consistent. □ 
Proof of Theorem 7 

First, the call EMPTY?($\alpha_P \cap \gamma$) on line 06 returns true iff no literal in $\alpha_P$ appears in $\gamma$. 
Since $\alpha_P$ represents a clause over some atoms in $I \cup \{O\}$ and $\gamma$ represents an instantiation of 
$I \cup \{O\}$, no literal of $\alpha_P$ appears in $\gamma$ exactly when every literal in $\alpha_P$ is contradicted by some 
literal in $\gamma$. Therefore, this call will return true iff $\gamma \models \neg\alpha_P$. 

Second, the call GREAT_DISJUNCTION($\alpha_A$) on line 07 returns a disjunction of the literals 
in $\alpha_A$. 

Therefore, for each instantiation $\gamma$ of $I \cup \{O\}$, the function COMPONENT_CONSEQUENCE 
computes a conjunction of disjunctions where each disjunction represents an $\alpha_A$ whose 
matching $\alpha_P$ is inconsistent with $\gamma$. The algorithm is then sound given Theorem 3. □ 
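
The computation argued about above can be sketched as follows. This is a minimal illustration rather than the paper's COMPONENT_CONSEQUENCE pseudocode: the clause encoding (a dict from atom name to the sign of its literal), the function name, and the returned table layout are all assumptions made for the example.

```python
from itertools import product


def component_consequence(clauses, atoms, assumables):
    """For every instantiation gamma of `atoms` (inputs plus output), collect
    the assumable parts alpha_A of the clauses whose non-assumable part
    alpha_P is contradicted by gamma. Each table entry is read as a
    conjunction of disjunctions; an empty list stands for `true`."""
    table = {}
    for values in product([True, False], repeat=len(atoms)):
        gamma = dict(zip(atoms, values))
        conjuncts = []
        for clause in clauses:
            alpha_p = {a: s for a, s in clause.items() if a not in assumables}
            alpha_a = {a: s for a, s in clause.items() if a in assumables}
            # gamma entails the negation of alpha_P iff every literal is contradicted.
            if all(gamma[a] != s for a, s in alpha_p.items()):
                conjuncts.append(alpha_a)
        table[values] = conjuncts
    return table


# Inverter X with input A and output C, in clausal form:
#   not A or not okX or not C,   A or not okX or C
inverter = [
    {"A": False, "okX": False, "C": False},
    {"A": True, "okX": False, "C": True},
]
table = component_consequence(inverter, ["A", "C"], {"okX"})
print(table[(True, True)])   # observation A and C
print(table[(True, False)])  # observation A and not C
```

In this example, observing both the input and the output high yields the single disjunction "not okX", while a normal input/output pair yields the empty conjunction.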

Proof of Theorem 8 

Line 01 will generate $2^{|I|+1}$ instantiations, which is the number of times that lines 02-08 will 
repeat. Each of lines 02, 03 and 08 takes $O(n)$ time. Line 04 takes $O(1)$ time. The loop in 
lines 05-07 will repeat $s$ times. Line 06 takes $O(n)$ time since EMPTY? can be implemented 
in $O(n)$ time. Line 07 also takes $O(n)$ time since GREAT_DISJUNCTION takes $O(n)$ time. 
Therefore, lines 05-07 take $O(sn)$ time and, hence, lines 02-08 take $O(sn)$. Finally, lines 01-09 
take $2^{|I|+1} O(sn)$ time which is $O(sn 2^{|I|})$. The space complexity is no worse than the 
time complexity since adding a node or arc to the computed consequence takes $O(1)$ time. □ 

Proof of Lemma 1 

First, we need to prove that atoms $C_i$ are instantiated to some $\alpha$ when the call 
CLIQUE_CONSEQUENCE($C_i$) is made. This follows immediately since CLIQUE_CONSEQUENCE($C_i$) 
is only called on line 06 of SUBTREE_CONSEQUENCE and $\alpha$ is instantiated on line 04 of the 
same function. 

The call CLIQUE_CONSEQUENCE($C_i$) computes the following expression: 

$$\bigwedge_{O \in \text{COMPONENTS\_OF}(C_i)} \text{CONSEQUENCES}_O[\text{INDEX}(\{O\} \cup I_O)].$$

Therefore, all we need to show is that 

$$\mathit{Cons}^{\Delta_i}(\alpha) = \bigwedge_{O \in \text{COMPONENTS\_OF}(C_i)} \text{CONSEQUENCES}_O[\text{INDEX}(\{O\} \cup I_O)].$$

Given Theorem 7, it is enough to show that 

$$\mathit{Cons}^{\Delta_i}(\alpha) = \bigwedge_{O \in \text{COMPONENTS\_OF}(C_i)} \mathit{Cons}^{\Delta_O}(\alpha_{\Delta_O}),$$

where $(I_O, O, \Delta_O)$ is the component description of $O$ and $\alpha_{\Delta_O}$ is the projection of 
instantiation $\alpha$ on the atoms $\{O\} \cup I_O$ appearing in $\Delta_O$. 

First note that $\alpha$ is an instantiation of the atoms in clique $C_i$. Therefore, the common 
atoms between any two $\Delta_O$'s must appear in $\alpha$ and, hence, the Decomposition Theorem 
gives: 

$$\mathit{Cons}^{\Delta_i}(\alpha) = \bigwedge_{O \in \text{COMPONENTS\_OF}(C_i)} \mathit{Cons}^{\Delta_O}(\alpha_{\Delta_O}).$$ □ 


Proof of Lemma 2 

We start with some concrete examples of the notation used. In Figure 5, we have: 

[...] 

while 

$$\Delta_{21} = \Delta_C \cup \Delta_E = \{\, A \land okX \supset \neg C,\ \ \neg A \land okX \supset C,\ \ C \land D \land okZ \supset E,\ \ \neg(C \land D) \land okZ \supset \neg E \,\}.$$

Moreover, given that $\phi = A \land E$, then $\phi_{12} = A$ and $\phi_{21} = A \land E$. 

Soundness of caching 

We will prove the lemma ignoring the caching on lines 00, 01 and 11. The soundness of this 
caching process follows because, according to the lemma, any two calls to 
SUBTREE_CONSEQUENCE($C_i$, $C_j$) will return equivalent sentences if they are made under the 
same instantiation $\alpha$ of sepset $S_{ij}$. Line 00 computes a unique index $I$ for each instantiation 
of $S_{ij}$, which is used on line 01 to check if a previous call with respect to this instantiation 
has been made. If not, the computed sentence is cached on line 11 under the index $I$. 

Proof by induction 

First, we need to prove that atoms $S_{ij}$ are instantiated when the call 
SUBTREE_CONSEQUENCE($C_i$, $C_j$) is made. SUBTREE_CONSEQUENCE is only called in two 
places: 

1. SUBTREE_CONSEQUENCE($C_z$, $C_0$) is called on line 03 of SYSTEM_CONSEQUENCE: the 
sepset $S_{z0} = \emptyset$, which is trivially instantiated. 

2. SUBTREE_CONSEQUENCE($C_k$, $C_i$) is called on line 08 of SUBTREE_CONSEQUENCE: the 
sepset $S_{ki}$ is a subset of clique $C_i$ and the atoms of this clique are instantiated on line 04. 
Therefore, the atoms of $S_{ki}$ are instantiated when this call to SUBTREE_CONSEQUENCE 
is made. 

The rest of the proof is by induction on the structure of the jointree. 

Base case: $C_i$ has a single neighbor $C_j$. 

In this case, line 08 of SUBTREE_CONSEQUENCE will not be executed. Therefore, 
the function SUBTREE_CONSEQUENCE is only computing the disjunction of all calls to 
CLIQUE_CONSEQUENCE($C_i$) where each such call is made with respect to some instantiation 
$\alpha$ of clique $C_i$ which is consistent with $\beta \land \phi_i$. 
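Given Lemma 1, all we need to show then is that: 

$$\mathit{Cons}^{\Delta_{ij}}(\beta \land \phi_{ij}) = \bigvee_{\alpha} \mathit{Cons}^{\Delta_i}(\alpha \land \phi_i).$$

By the Case-Analysis Theorem, we have: 

$$\mathit{Cons}^{\Delta_{ij}}(\beta \land \phi_{ij}) = \bigvee_{\alpha} \mathit{Cons}^{\Delta_{ij}}(\alpha \land \beta \land \phi_{ij}),$$

where $\alpha$ ranges over all instantiations of clique $C_i$ which are consistent with $\beta \land \phi_{ij}$. We 
also have 

- $\Delta_{ij}$ equals $\Delta_i$ and $\phi_{ij} = \phi_i$ since the subtree $T_{ij}$ contains only the clique $C_i$. 

- $\alpha \land \beta \land \phi_{ij}$ is equivalent to $\alpha \land \beta \land \phi_i$. 

- $\alpha \land \beta \land \phi_i$ is equivalent to $\alpha \land \phi_i$ since $\alpha \land \beta \land \phi_i$ is consistent and the atoms in $\beta$ are 
a subset of those in $\alpha \land \phi_i$. 

Therefore, 

$$\mathit{Cons}^{\Delta_{ij}}(\beta \land \phi_{ij}) = \bigvee_{\alpha} \mathit{Cons}^{\Delta_i}(\alpha \land \phi_i),$$

which is what we need to show. 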

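Inductive step: Clique $C_i$ has more than one neighbor. 

The induction hypothesis is that the call SUBTREE_CONSEQUENCE($C_k$, $C_i$) on line 08 will 
return a sentence equivalent to $\mathit{Cons}^{\Delta_{ki}}(\alpha_{ki} \land \phi_{ki})$ where $\alpha_{ki}$ is the instantiation of sepset 
$S_{ki}$. 

Given this induction hypothesis, we want to show that the call 
SUBTREE_CONSEQUENCE($C_i$, $C_j$) will return a sentence equivalent to $\mathit{Cons}^{\Delta_{ij}}(\beta \land \phi_{ij})$ where 
$\beta$ is the instantiation of sepset $S_{ij}$ before the call is made. 

Given the induction hypothesis and Lemma 1, lines 02-10 are setting Disj to the following 
expression 

$$\bigvee_{\alpha} \Big[ \mathit{Cons}^{\Delta_i}(\alpha) \land \bigwedge_{k \neq j} \mathit{Cons}^{\Delta_{ki}}(\alpha_{ki} \land \phi_{ki}) \Big],$$

where $\alpha$ is the instantiation of clique $C_i$ generated in line 03 and $\alpha_{ki}$ is the projection of 
instantiation $\alpha$ on sepset $S_{ki}$. Note here that $\alpha$ is guaranteed to be consistent with $\phi$ and 
$\beta$ (see the pseudocode of GENERATE_INSTANTIATIONS). 

All we need to show then is that this computed expression is equivalent to $\mathit{Cons}^{\Delta_{ij}}(\beta \land 
\phi_{ij})$. By observing that $\Delta_{ij}$ can be decomposed into $\Delta_i$ and $\Delta_{ki}$ where $k \neq j$, this equivalence 
can be proven using the Intersection Corollary. Specifically, given that the atoms in 
clique $C_i$ contain all atoms that are common between any two subtrees $T_{ki}$, the Corollary 
gives: 

$$\mathit{Cons}^{\Delta_{ij}}(\beta \land \phi_{ij}) = \bigvee_{\alpha} \Big[ \mathit{Cons}^{\Delta_i}(\alpha \land (\beta \land \phi_{ij})_i) \land \bigwedge_{k \neq j} \mathit{Cons}^{\Delta_{ki}}(\alpha_{ki} \land (\beta \land \phi_{ij})_{ki}) \Big].$$

Note that $(\phi_{ij})_i = \phi_i$ and $(\phi_{ij})_{ki} = \phi_{ki}$. This leads to 

$$\mathit{Cons}^{\Delta_{ij}}(\beta \land \phi_{ij}) = \bigvee_{\alpha} \Big[ \mathit{Cons}^{\Delta_i}(\alpha \land \beta_i \land \phi_i) \land \bigwedge_{k \neq j} \mathit{Cons}^{\Delta_{ki}}(\alpha_{ki} \land \beta_{ki} \land \phi_{ki}) \Big].$$

The atoms in $\beta_i \land \phi_i$ are a subset of the atoms in $\alpha$. Therefore, $\alpha \models \beta_i \land \phi_i$ given that 
$\alpha$ and $\beta_i \land \phi_i$ are consistent. Similarly, the atoms in $\beta_{ki}$ are a subset of the atoms in $\alpha_{ki}$. 
Therefore, $\alpha_{ki} \models \beta_{ki}$ given that $\alpha_{ki}$ and $\beta_{ki}$ are consistent. This leads to 

$$\mathit{Cons}^{\Delta_{ij}}(\beta \land \phi_{ij}) = \bigvee_{\alpha} \Big[ \mathit{Cons}^{\Delta_i}(\alpha) \land \bigwedge_{k \neq j} \mathit{Cons}^{\Delta_{ki}}(\alpha_{ki} \land \phi_{ki}) \Big].$$ □ 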

Proof of Theorem 9 

To prove that the function SYSTEM_CONSEQUENCE terminates, it is enough to prove that the 
calls to SUBTREE_CONSEQUENCE will not recurse infinitely. To show this, note that all recursive 
calls made by SUBTREE_CONSEQUENCE($C_i$, $C_j$) are of the form 
SUBTREE_CONSEQUENCE($C_x$, $C_y$) where the arc $C_x$-$C_y$ belongs to the subtree $T_{ij}$. Therefore, 
the number of arcs visited by a recursive call is smaller than the number of arcs visited 
by its parent call. The boundary condition is when a call SUBTREE_CONSEQUENCE($C_i$, $C_j$) 
is made and subtree $T_{ij}$ has a single clique in it. 

Proving that the function SYSTEM_CONSEQUENCE returns the desired consequence follows 
directly from Lemma 2. Consider the jointree $T'$ and its corresponding $\Delta'$ that results 
from adding the clique $C_0 = \emptyset$ as a neighbor to the pivot clique $C_z$. We have $T'_{z0} = T$, 
$\Delta'_{z0} = \Delta$, and $\phi_{z0} = \phi$. Moreover, the sepset $S_{z0}$ is empty and it has one instantiation, 
true. Therefore, by Lemma 2, the call on line 03 of SYSTEM_CONSEQUENCE returns 
$\mathit{Cons}^{\Delta'_{z0}}(\mathit{true} \land \phi_{z0}) = \mathit{Cons}^{\Delta}(\phi)$. □ 

Proof of Theorem 10 

Suppose that each component consequence CONSEQUENCES$_O$[.] is in decomposable NNF. 

That the sentence returned by SYSTEM_CONSEQUENCE is in NNF follows immediately 
given that only disjunctions and conjunctions are constructed in Figure 8 (no negations). 

To prove that the sentence is decomposable, we need three results: 

• No assumables are shared between component descriptions, which follows from the 
definition of a structured system description. 

• The NNF returned by CLIQUE_CONSEQUENCE($C_i$) mentions only assumables that 
appear in component descriptions assigned to clique $C_i$. This follows immediately 
from examining the pseudocode of CLIQUE_CONSEQUENCE($C_i$). 

• The NNF returned by SUBTREE_CONSEQUENCE($C_i$, $C_j$) mentions only assumables that 
appear in component descriptions assigned to cliques in subtree $T_{ij}$. This can be shown 
by induction on the structure of a jointree. 

To prove that the returned NNF is decomposable, all we need to show is that whenever a 
conjunction is constructed in Figure 8, the conjuncts are guaranteed to share no assumables. 
There are two places where conjunctions are constructed: 

1. CLIQUE_CONSEQUENCE: Conj is set to the conjunction of consequences that correspond 
to distinct component descriptions. Therefore, they are guaranteed to share no 
assumables. 

2. SUBTREE_CONSEQUENCE: Conj is set to the conjunction of an NNF returned by 
CLIQUE_CONSEQUENCE($C_i$) together with NNFs returned by 
SUBTREE_CONSEQUENCE($C_k$, $C_i$). These NNFs are guaranteed to share no assumables 
since clique $C_i$ and subtrees $T_{ki}$ share no assumables. 

Therefore, the NNF returned by SYSTEM_CONSEQUENCE must be decomposable. □ 
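
The property established above, that the conjuncts of every conjunction share no atoms, is easy to test mechanically. The following sketch is an illustration only: the nested-tuple representation of an NNF (`("lit", (atom, sign))`, `("and", [...])`, `("or", [...])`) and both function names are assumptions made for the example, not the paper's data structures.

```python
def atoms_of(node):
    """Set of atoms mentioned anywhere below an NNF node."""
    kind, payload = node
    if kind == "lit":
        return {payload[0]}
    found = set()
    for child in payload:
        found |= atoms_of(child)
    return found


def is_decomposable(node):
    """True iff no two conjuncts of any and-node share an atom."""
    kind, payload = node
    if kind == "lit":
        return True
    if kind == "and":
        seen = set()
        for child in payload:
            child_atoms = atoms_of(child)
            if seen & child_atoms:
                return False
            seen |= child_atoms
    return all(is_decomposable(child) for child in payload)


good = ("and", [("lit", ("okX", False)),
                ("or", [("lit", ("okY", False)), ("lit", ("okZ", False))])])
bad = ("and", [("lit", ("okX", False)),
               ("or", [("lit", ("okX", True)), ("lit", ("okY", False))])])
print(is_decomposable(good), is_decomposable(bad))
```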

Lemma 4 In Figure 8, the number of calls to SUBTREE_CONSEQUENCE($C_i$, $C_j$) that will not 
return from line 01 is no more than $2^{|S_{ij} \setminus E|}$. We will refer to such calls as non-cached calls. 

Proof of Lemma 4 

By Lemma 2, the sepset $S_{ij}$ is instantiated when SUBTREE_CONSEQUENCE($C_i$, $C_j$) is called. 
Moreover, the index of such an instantiation is the key used in the cache lookup on line 01 
of SUBTREE_CONSEQUENCE. The total number of possible keys is $2^{|S_{ij} \setminus E|}$ since this is 
the maximum number of distinct instantiations generated for $S_{ij}$ (note that atoms $E$ are 
instantiated during the run of the algorithm). In case of a cache hit on line 01, the function 
returns immediately and the call is considered cached. In case of a cache miss, the call is 
non-cached and it will lead to an insertion into the cache on line 11 of SUBTREE_CONSEQUENCE. 
Since there are no more than $2^{|S_{ij} \setminus E|}$ keys, there are no more than $2^{|S_{ij} \setminus E|}$ cache insertions 
and, hence, no more than $2^{|S_{ij} \setminus E|}$ cache misses. Therefore, the maximum number of 
non-cached calls is $2^{|S_{ij} \setminus E|}$. □ 
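
The counting argument can be made concrete with a small cache keyed by the instantiation of the non-observed sepset atoms. The sketch below is illustrative only: the class name `SepsetCache`, the binary index encoding, and the placeholder computation are assumptions, not the caching code of Figure 8.

```python
from itertools import product


class SepsetCache:
    """Cache keyed by the values of the sepset atoms that are not observed."""

    def __init__(self, sepset, evidence_atoms):
        self.free_atoms = sorted(set(sepset) - set(evidence_atoms))
        self.entries = {}
        self.misses = 0  # number of non-cached calls

    def index(self, instantiation):
        # Read the free atoms as binary digits to get a unique integer key.
        key = 0
        for atom in self.free_atoms:
            key = 2 * key + int(instantiation[atom])
        return key

    def lookup_or_compute(self, instantiation, compute):
        key = self.index(instantiation)
        if key not in self.entries:
            self.misses += 1
            self.entries[key] = compute(instantiation)
        return self.entries[key]


cache = SepsetCache({"A", "B", "C"}, {"C"})
for _ in range(2):  # repeat every call to trigger cache hits
    for a, b in product([False, True], repeat=2):
        cache.lookup_or_compute({"A": a, "B": b, "C": True}, lambda inst: sorted(inst.items()))
print(cache.misses, 2 ** len(cache.free_atoms))
```

With two free sepset atoms, the eight calls above produce four cache misses, which matches the bound of the lemma.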

Proof of Theorem 11 

We will use the following notation: 

• $d_i =_{def} \sum_{O \in \text{COMPONENTS\_OF}(C_i)} |I_O \cup \{O\}|$. 

• $e_{ij} =_{def} d_i + |C_i \setminus S_{ij} \setminus E| + \sum_k |S_{ki}|$ where $C_k$ is a neighbor of clique $C_i$. 

Consider the following observations: 

• The call GENERATE_INSTANTIATIONS($C_i$) on line 03 of SUBTREE_CONSEQUENCE takes 
$O(2^{|C_i \setminus S_{ij} \setminus E|})$ time, which is also the number of instantiations it returns and the number 
of times that lines 04-10 of SUBTREE_CONSEQUENCE will repeat. 

• The calls ASSERT($\alpha$) and RETRACT($\alpha$) on lines 04 and 10 of SUBTREE_CONSEQUENCE 
take $O(|C_i \setminus S_{ij} \setminus E|)$ time each since the size of $\alpha$ is $O(|C_i \setminus S_{ij} \setminus E|)$. 

• The call CLIQUE_CONSEQUENCE($C_i$) on line 06 of SUBTREE_CONSEQUENCE takes $O(d_i)$ 
time. 

• The call SUBTREE_CONSEQUENCE($C_k$, $C_i$) on line 08 takes $O(|S_{ki}|)$ time if it is cached. 

Therefore, lines 04-10 of SUBTREE_CONSEQUENCE take $O(e_{ij})$ time assuming that all calls to 
SUBTREE_CONSEQUENCE($C_k$, $C_i$) are cached.$^{24}$ If we assume that the number of neighbors 
per clique is a constant and that the number of components per clique is also a constant, 
then $e_{ij} = O(|C_i|)$. We will indeed make this assumption in the following proof. 

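We first show that the time of all non-cached calls to SUBTREE_CONSEQUENCE($C_i$, $C_j$) is 

$$O\Big(\sum_{C \in T_{ij}} |C| \, 2^{|C \setminus E|}\Big).$$

The proof is by induction on the structure of the jointree: 

• Base case: Clique $C_i$ has only one neighbor $C_j$. 

Line 08 does not execute and the time of a single non-cached call to 
SUBTREE_CONSEQUENCE($C_i$, $C_j$) is $O(|C_i| \, 2^{|C_i \setminus S_{ij} \setminus E|})$. By Lemma 4, there are no more 
than $2^{|S_{ij} \setminus E|}$ non-cached calls to SUBTREE_CONSEQUENCE($C_i$, $C_j$). The total time of 
non-cached calls is then 

$$2^{|S_{ij} \setminus E|} \, O\big(|C_i| \, 2^{|C_i \setminus S_{ij} \setminus E|}\big) = O\big(|C_i| \, 2^{|S_{ij} \setminus E| + |C_i \setminus S_{ij} \setminus E|}\big),$$

which is equal to $O(|C_i| \, 2^{|C_i \setminus E|})$ because $S_{ij}$ is a subset of $C_i$. 

Since $C_i$ has only one neighbor $C_j$, the subtree $T_{ij}$ contains only the clique $C_i$. Therefore, 

$$O\big(|C_i| \, 2^{|C_i \setminus E|}\big) = O\Big(\sum_{C \in T_{ij}} |C| \, 2^{|C \setminus E|}\Big).$$

• Inductive step: Clique $C_i$ has more than one neighbor. 

Suppose that for each neighbor $C_k$ of $C_i$ other than $C_j$, the time of all non-cached calls to 
SUBTREE_CONSEQUENCE($C_k$, $C_i$) on line 08 is 

$$O\Big(\sum_{C \in T_{ki}} |C| \, 2^{|C \setminus E|}\Big).$$

By Lemma 4, the time of all non-cached calls to SUBTREE_CONSEQUENCE($C_i$, $C_j$) is 
then 

$$\underbrace{2^{|S_{ij} \setminus E|} \, O\big(|C_i| \, 2^{|C_i \setminus S_{ij} \setminus E|}\big)}_{\text{cost assuming recursive calls are cached}} + \underbrace{\sum_{k \neq j} O\Big(\sum_{C \in T_{ki}} |C| \, 2^{|C \setminus E|}\Big)}_{\text{cost of non-cached recursive calls}},$$

which reduces to 

$$O\big(|C_i| \, 2^{|C_i \setminus E|}\big) + \sum_{k \neq j} O\Big(\sum_{C \in T_{ki}} |C| \, 2^{|C \setminus E|}\Big)$$

since $S_{ij} \subseteq C_i$, and then to 

$$O\Big(\sum_{C \in T_{ij}} |C| \, 2^{|C \setminus E|}\Big)$$

since subtree $T_{ij}$ consists of clique $C_i$ and the subtrees $T_{ki}$ where $k \neq j$. 

24. To get the total cost of lines 04-10, we must also add the cost of non-cached calls to 
SUBTREE_CONSEQUENCE($C_k$, $C_i$). 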
We are now ready to bound the running time of SYSTEM_CONSEQUENCE: line 01 costs 
$O(1)$; line 02 costs $O(|E|)$ time; and line 03 is bound by the cost of all non-cached calls to 
SUBTREE_CONSEQUENCE($C_z$, $C_0$), which is 

$$O\Big(\sum_{C \in T'_{z0}} |C| \, 2^{|C \setminus E|}\Big) = O\Big(\sum_{C \in T} |C| \, 2^{|C \setminus E|}\Big)$$

since $T'_{z0} = T$. Therefore, the time of SYSTEM_CONSEQUENCE is $O(\sum_{C \in T} |C| \, 2^{|C \setminus E|})$. 

This is also the space complexity of the returned NNF since the addition of either a 
node or an arc to the NNF takes $O(1)$ time. □ 
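
To get a feel for the bound, the sum over cliques can be evaluated directly for a given jointree and set of observed atoms. The helper below is an illustration only: its name and the toy cliques are assumptions, and it computes the expression inside the $O(\cdot)$, not an actual running time.

```python
def consequence_size_bound(cliques, observed_atoms):
    """Evaluate sum over cliques C of |C| * 2 ** |C minus E| for a jointree
    given as a list of cliques (sets of atoms) and a set E of observed atoms."""
    observed = set(observed_atoms)
    return sum(len(c) * 2 ** len(set(c) - observed) for c in cliques)


# Hypothetical two-clique jointree; observing atoms shrinks the bound.
cliques = [{"A", "C"}, {"C", "D", "E"}]
print(consequence_size_bound(cliques, set()))
print(consequence_size_bound(cliques, {"A", "E"}))
```

Each observed atom removes a factor of two from every clique that contains it.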

Proof of Theorem 12 

• Reflexive: 

For all $a \in S$, $a \oplus 0 = a$ and, hence, $a \leq_\oplus a$. 

• Transitive: 

Suppose that $a \leq_\oplus b$ and $b \leq_\oplus c$. Then there exist $x$ and $y$ such that $a \oplus x = b$ and 
$b \oplus y = c$. Moreover, $a \oplus x \oplus y = c$ and, hence, $a \leq_\oplus c$. 

• Anti-symmetric: 

Suppose that $a \leq_\oplus b$ and $b \leq_\oplus a$. Then there exist $x$ and $y$ such that $a \oplus x = b$ and 
$b \oplus y = a$. Moreover, $a \oplus x \oplus y = a$. This leads to $x \oplus y = 0$ and, hence, $x = 0$ and 
$y = 0$. Therefore, $a = b$. 

• Total: 

For all $a$ and $b$, either $a \oplus c = b$ or $a = b \oplus c$ for some unique $c$. Hence, for all $a$ and 
$b$, either $a \leq_\oplus b$ or $b \leq_\oplus a$. □ 

Lemma 5 The addition operation of a cost function satisfies the following properties: 

1. $a \oplus b = a$ implies $b = 0$. 

2. If $a <_\oplus b$, then $a \oplus c <_\oplus b \oplus c$. 

Proof of Lemma 5 

1. We have $a \oplus 0 = a$. We also have that $a \oplus b = a$ for some unique $b$. Therefore, $b = 0$. 

2. If $a <_\oplus b$, then $a \oplus x = b$ and $x \neq 0$. We then have $a \oplus x \oplus c = b \oplus c$, which leads 
to $a \oplus c \leq_\oplus b \oplus c$. Since $a \oplus x \oplus c = b \oplus c$ for a unique $x$, and since $x \neq 0$, we have 
$a \oplus c \neq b \oplus c$ and, hence, $a \oplus c <_\oplus b \oplus c$. □ 
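
The order induced by the addition operation can be checked on a small instance. The sketch below is illustrative only and rests on an assumption: it takes the natural numbers with ordinary addition as the cost domain, which is just one structure that behaves like the addition operations discussed here. The function name `leq` and the finite search domain are also assumptions.

```python
def leq(a, b, add, domain):
    """a precedes-or-equals b in the induced order: add(a, x) == b for some x."""
    return any(add(a, x) == b for x in domain)


def add(a, b):
    return a + b


domain = range(16)  # finite search space for the witness x
small = range(4)    # elements on which the properties are checked

reflexive = all(leq(a, a, add, domain) for a in small)
antisymmetric = all(a == b for a in small for b in small
                    if leq(a, b, add, domain) and leq(b, a, add, domain))
total = all(leq(a, b, add, domain) or leq(b, a, add, domain)
            for a in small for b in small)
# Second property of Lemma 5: strict order is preserved when adding c on both sides.
strictly_monotone = all(
    leq(add(a, c), add(b, c), add, domain) and add(a, c) != add(b, c)
    for a in small for b in small for c in small
    if leq(a, b, add, domain) and a != b
)
print(reflexive, antisymmetric, total, strictly_monotone)
```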

Lemma 6 In Figure 12, if the instantiations in set $T$ have the same cost $c$, then the 
instantiations in EXTEND($T$, $S$) will have the same cost $c$. □ 

Moreover, we will refer to the instantiations in EXTEND($T$, $S$) as the zero extensions of the 
instantiations in $T$. 

The following lemma is with respect to the algorithm in Figure 12. 

Lemma 7 After termination of the algorithm in Figure 12, we have the following for every 
node $\mu$ in the NNF-graph $\tau$ and every instantiation $\alpha$ in TERMS($\mu$): 

1. $\alpha \models \mu$; 

2. COST($\mu$) $= \mathcal{F}(\alpha)$; and 

3. $\alpha$ is an $S$-instantiation where $S$ are all the atoms appearing in the NNF-graph rooted 
at $\mu$. 

Proof of Lemma 7 

The proof is by induction on the structure of the NNF-graph $\tau$. 

Base case: $\mu$ is a leaf node. 

• Case I: $\mu$ is a literal-node. Then COST($\mu$) $= \mathcal{F}(\text{LITERAL\_OF}(\mu)) = \mathcal{F}(\{\text{LITERAL\_OF}(\mu)\})$ 
and TERMS($\mu$) $= \{\{\text{LITERAL\_OF}(\mu)\}\}$. The properties hold. 

• Case II: $\mu = \mathit{true}$ is an and-node. Then COST($\mu$) $= 0 = \mathcal{F}(\{\})$ and TERMS($\mu$) $= \{\{\}\}$. 
The properties hold. 

• Case III: $\mu = \mathit{false}$ is an or-node. Then COST($\mu$) $= \infty$ and TERMS($\mu$) $= \{\}$. The 
properties hold. 

Inductive step: $\mu$ is a node with children. 

Suppose that the property holds for each child $\mu_i$ of $\mu$ and that $\alpha \in$ TERMS($\mu$). 

• Case I: $\mu$ is an and-node. Then $\alpha$ has the form $\bigwedge_i \alpha_i$ where $\alpha_i \in$ TERMS($\mu_i$). 

1. By the induction hypothesis, $\alpha_i \models \mu_i$. Hence, $\bigwedge_i \alpha_i \models \bigwedge_i \mu_i$ and $\alpha \models \mu$ since 
$\mu = \bigwedge_i \mu_i$. 

2. By the induction hypothesis, COST($\mu_i$) $= \mathcal{F}(\alpha_i)$. From line 04 of PRUNE, COST($\mu$) $= 
\bigoplus_i$ COST($\mu_i$) and, hence, COST($\mu$) $= \bigoplus_i \mathcal{F}(\alpha_i) = \mathcal{F}(\alpha)$. 

3. $S = \bigcup_i S_i$ where $S_i$ are the atoms appearing in child $\mu_i$. By the induction 
hypothesis, each $\alpha_i$ is an $S_i$-instantiation. Therefore, $\alpha$ must be an $S$-instantiation. 

• Case II: $\mu$ is an or-node. Then $\alpha$ is the zero extension of some $\alpha_i \in$ TERMS($\mu_i$) where 
$\mu_i$ is a child of $\mu$ and COST($\mu$) $=$ COST($\mu_i$). 

1. By the induction hypothesis, $\alpha_i \models \mu_i$. Therefore, $\alpha \models \alpha_i \models \mu_i \models \bigvee_i \mu_i \models \mu$. 

2. By the induction hypothesis, COST($\mu_i$) $= \mathcal{F}(\alpha_i)$ and by Lemma 6, $\mathcal{F}(\alpha_i) = \mathcal{F}(\alpha)$. 
Hence, COST($\mu_i$) $= \mathcal{F}(\alpha)$. Since COST($\mu_i$) $=$ COST($\mu$), we must then have 
COST($\mu$) $= \mathcal{F}(\alpha)$. 

3. By the induction hypothesis, each $\alpha_i$ is an $S_i$-instantiation where $S_i$ are the 
atoms appearing in $\mu_i$. By calling EXTEND(TERMS($\mu_i$), $\mathit{Atoms}(\mu, \mu_i)$), we are 
extending each $\alpha_i \in$ TERMS($\mu_i$) with one literal for each atom in $\mathit{Atoms}(\mu, \mu_i)$. 
Since $\mathit{Atoms}(\mu, \mu_i) = S \setminus S_i$, the result of this extension must then be an 
$S$-instantiation. □ 
Proof of Theorem 13 

In this theorem, $\mathit{MinInst}$ is the set of minimal instantiations with respect to the cost function 
$(\mathit{Atoms}(\mu), S, \oplus, \mathcal{F})$ where $\mathit{Atoms}(\mu)$ are all atoms appearing in the NNF-graph rooted at 
$\mu$. 

The proof is by induction on the structure of $\tau$. 

Base case: $\mu$ is a leaf node. 

• Case I: $\mu$ is a literal-node. Then TERMS($\mu$) $= \{\{\text{LITERAL\_OF}(\mu)\}\} = \mathit{MinInst}(\mu)$. 

• Case II: $\mu = \mathit{true}$ is an and-node. Then TERMS($\mu$) $= \{\{\}\} = \mathit{MinInst}(\mu)$. 

• Case III: $\mu = \mathit{false}$ is an or-node. Then TERMS($\mu$) $= \{\} = \mathit{MinInst}(\mu)$. 

Inductive step: $\mu$ is a node with children. 

Suppose that TERMS($\mu_i$) $= \mathit{MinInst}(\mu_i)$ for each child $\mu_i$ of $\mu$. 

First direction: If $\alpha \in \mathit{MinInst}(\mu)$, then $\alpha \in$ TERMS($\mu$). 

Suppose that $\alpha \in \mathit{MinInst}(\mu)$. Then $\alpha \models \mu$. 

• Case I: $\mu$ is an and-node. 

Since $\tau$ is a decomposable NNF, the atoms that appear in each $\mu_i$ must be disjoint. 
Therefore, $\alpha$ must have the form $\bigwedge_i \alpha_i$ where $\alpha_i$ is the projection of $\alpha$ on the atoms 
in $\mu_i$. Moreover, we must have $\alpha_i \models \mu_i$ for each $\alpha_i$. It suffices then to show that 
$\alpha_i \in$ TERMS($\mu_i$). Suppose that $\alpha_i \notin$ TERMS($\mu_i$) for some $\alpha_i$. By the induction 
hypothesis, $\alpha_i \notin \mathit{MinInst}(\mu_i)$. Since $\alpha_i \models \mu_i$, we must then have COST($\mu_i$) $<_\oplus \mathcal{F}(\alpha_i)$. 
Now let $\alpha'$ be the result of replacing this $\alpha_i$ in $\alpha$ with some $\beta_i \in$ TERMS($\mu_i$). Then 
$\alpha' \models \mu$ since $\beta_i \models \mu_i$ by Lemma 7. Moreover, the cost of $\alpha'$ must be less than the cost 
of $\alpha$ by Lemma 5. Therefore, $\alpha$ cannot be in $\mathit{MinInst}(\mu)$, which is a contradiction. We 
then conclude that $\alpha_i \in$ TERMS($\mu_i$) and, hence, that $\alpha \in$ TERMS($\mu$). 

• Case II: $\mu$ is an or-node. 

Since $\alpha \models \mu$, we must have $\alpha \models \mu_i$ for some child $\mu_i$ of $\mu$. Let $\alpha_i$ be the projection of 
$\alpha$ on the atoms in $\mu_i$. Then $\mathcal{F}(\alpha_i) \leq_\oplus \mathcal{F}(\alpha)$, $\alpha_i \models \mu_i$ and we have one of two cases: 

1. $\alpha_i \in$ TERMS($\mu_i$): Then $\mathcal{F}(\alpha_i) =$ COST($\mu_i$) by Lemma 7 and we have two cases: 

(a) $\mathcal{F}(\alpha_i) = \mathcal{F}(\alpha)$: $\alpha$ is then a zero extension of $\alpha_i$ and, hence, $\alpha \in$ TERMS($\mu$). 

(b) $\mathcal{F}(\alpha_i) <_\oplus \mathcal{F}(\alpha)$: any zero extension of $\alpha_i$ will both entail $\mu$ and have a lower 
cost than $\alpha$. This contradicts $\alpha \in \mathit{MinInst}(\mu)$ and the case is impossible. 

2. $\alpha_i \notin$ TERMS($\mu_i$): By the induction hypothesis, $\alpha_i \notin \mathit{MinInst}(\mu_i)$. Since $\alpha_i \models \mu_i$, 
this means that COST($\mu_i$) $<_\oplus \mathcal{F}(\alpha_i)$. Therefore, the zero extension of any $\beta_i$ in 
TERMS($\mu_i$) will both entail $\mu$ and have a smaller cost than $\alpha$. This contradicts 
$\alpha \in \mathit{MinInst}(\mu)$ and the case is impossible. 

Second direction: If $\alpha \in$ TERMS($\mu$), then $\alpha \in \mathit{MinInst}(\mu)$. 

Suppose that $\alpha \in$ TERMS($\mu$). We need to show two things: 

• $\alpha \models \mu$: 

This follows by Lemma 7. 

• $\mathcal{F}(\alpha) = \mathcal{F}(\gamma)$ for some $\gamma \in \mathit{MinInst}(\mu)$: 

Suppose that $\gamma \in \mathit{MinInst}(\mu)$. We just showed above that $\gamma \in$ TERMS($\mu$). By 
Lemma 7, all instantiations in TERMS($\mu$) have the same cost, which is also the cost of 
$\gamma$. Therefore, $\alpha$ and $\gamma$ must have the same cost. 

This leads to $\alpha \in \mathit{MinInst}(\mu)$. □ 

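
The bottom-up computation that this proof reasons about can be sketched as follows. This is a simplified illustration, not the algorithm of Figure 12. It assumes a cost function that counts negative literals (so the zero-cost literal of every atom is the positive one) and it represents the NNF as a nested-tuple tree rather than a graph, so it omits the per-node caching of costs that a shared graph would need. All names are hypothetical.

```python
from itertools import product

INF = float("inf")


def lit_cost(literal):
    # Assumed cost function: a negative literal costs 1, a positive one costs 0.
    atom, positive = literal
    return 0 if positive else 1


def atoms_of(node):
    kind, payload = node
    if kind == "lit":
        return {payload[0]}
    found = set()
    for child in payload:
        found |= atoms_of(child)
    return found


def min_terms(node):
    """Return (cost, terms): the minimal cost of a node and its minimal
    instantiations, each a frozenset of (atom, sign) literals."""
    kind, payload = node
    if kind == "lit":
        return lit_cost(payload), {frozenset([payload])}
    results = [min_terms(child) for child in payload]
    if kind == "and":
        # Costs add up; terms are the cartesian product of the children's terms.
        cost = sum(c for c, _ in results)
        terms = {frozenset().union(*combo)
                 for combo in product(*(t for _, t in results))}
        return cost, terms
    # or-node: keep only minimum-cost children and zero-extend their terms.
    cost = min((c for c, _ in results), default=INF)
    if cost == INF:
        return INF, set()
    scope = atoms_of(node)
    terms = set()
    for child, (c, t) in zip(payload, results):
        if c == cost:
            zero = frozenset((a, True) for a in scope - atoms_of(child))
            terms |= {term | zero for term in t}
    return cost, terms


nnf = ("and", [
    ("or", [("lit", ("okX", False)), ("lit", ("okY", False))]),
    ("or", [("lit", ("okZ", False)), ("lit", ("okZ", True))]),
])
cost, terms = min_terms(nnf)
print(cost, len(terms))
```

For this input the sketch finds two minimal instantiations of cost one: either X or Y is faulty while the remaining components are assumed to be working.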
Proof of Theorem 14 

The call to PRUNE on line 01 takes $O(E)$ time. Note that saving the computed costs is 
essential since $\tau$ is a graph, not a tree. Therefore, without the check on line 01 of PRUNE, 
the cost of a node may be computed more than once since a node can have more than one 
parent. 

The call to INSTANTIATIONS on line 02 takes $O(|\text{TERMS}(\tau)|^2 E)$ time, where the explanation 
is given below. The total time of the extraction algorithm is then $O(|\text{TERMS}(\tau)|^2 E)$. 

INSTANTIATIONS is similar to PRUNE except that the amount of work done at each node 
$\mu$ is different. To bound this amount, we first observe that for any node $\mu$ with COST($\mu$) $\neq \infty$ 
and its child $\nu$:$^{25}$ 

$$|\text{TERMS}(\nu)| \leq |\text{TERMS}(\mu)|. \qquad (1)$$

This follows because: 

- if $\mu$ is an and-node, then TERMS($\mu$) is the cartesian product of all TERMS($\nu$) and, 
hence, its cardinality cannot be less than the cardinality of any TERMS($\nu$); and 

- if $\mu$ is an or-node, then TERMS($\mu$) is the union of all EXTEND(TERMS($\nu$), $\mathit{Atoms}(\mu, \nu)$) 
and, hence, its cardinality cannot be less than the cardinality of any TERMS($\nu$).$^{26}$ 

Therefore, the computation of TERMS($\mu$) takes 

- $O(|\text{TERMS}(\mu)|)$ time if $\mu$ is an and-node, which is also the size of the cartesian 
product 

$$\text{TERMS}(\mu) = \prod_{\nu \in \text{CHILDREN}(\mu)} \text{TERMS}(\nu).$$

- $O(|\text{CHILDREN}(\mu)| \, |\text{TERMS}(\mu)|^2)$ time if $\mu$ is an or-node, which is justified as follows: 

- the union of two sets with sizes $n$ and $m$ takes $O(nm)$ time. 

- to compute TERMS($\mu$) we must perform $O(|\text{CHILDREN}(\mu)|)$ union operations. 

- $|\text{TERMS}(\nu)| \leq |\text{TERMS}(\mu)|$ for all $\nu$ in CHILDREN($\mu$). 

25. If COST($\mu$) $= \infty$, TERMS($\mu$) $= \{\}$ and the case is handled by line 02 of INSTANTIATIONS. 

26. Note that $|\text{TERMS}(\nu)| \leq |\text{EXTEND}(\text{TERMS}(\nu), \mathit{Atoms}(\mu, \nu))|$. 

Therefore, the time to compute TERMS($\mu$) is $O(|\text{CHILDREN}(\mu)| \, |\text{TERMS}(\mu)|^2)$ in the worst 
case. The total time taken by INSTANTIATIONS is then 

$$\sum_{\mu} O\big(|\text{CHILDREN}(\mu)| \, |\text{TERMS}(\mu)|^2\big),$$

where $\mu$ denotes a node in the NNF-graph $\tau$. This reduces to: 

$$\sum_{\mu} O\big(|\text{CHILDREN}(\mu)| \, |\text{TERMS}(\tau)|^2\big)$$

since $|\text{TERMS}(\mu)| \leq |\text{TERMS}(\tau)|$ for any node $\mu$ in the NNF-graph $\tau$.$^{27}$ Reducing this 
further, we get: 

$$|\text{TERMS}(\tau)|^2 \sum_{\mu} O\big(|\text{CHILDREN}(\mu)|\big) = |\text{TERMS}(\tau)|^2 \, O(E) = O\big(|\text{TERMS}(\tau)|^2 E\big).$$ □ 